Re: [PATCH] i2c: rk3x: init module as subsys call

2016-01-05 Thread Heiko Stuebner
Hi Tao,

On Tuesday, 5 January 2016, 15:42:32, Huang, Tao wrote:
> On 2016-01-05 15:02, Heiko Stuebner wrote:
> > Hi Jianqun,
> > 
> > On Tuesday, 5 January 2016, 11:02:18, jianqun.xu wrote:
> >> From: Xu Jianqun 
> >> 
> > >> There is a requirement from the pmic device, which is on the i2c bus,
> > >> that the pmic needs to be probed earlier than devices powered by
> > >> the outputs of the pmic; if not, those devices may fail to probe.
> > >> 
> > >> For example, a pmic on i2c0, and touchscreen device on i2c2,
> > >> i2c0: - pmic(rk818)
> > >> i2c2: - ts(gt911), powered by rk818 on i2c0
> > >> 
> > >> The problem will happen if the i2c2 node in the dts file is ordered
> > >> before the i2c0 node: ts(gt911) will then be probed before pmic(rk818),
> > >> and since the power from the pmic(rk818) for ts(gt911) hasn't been
> > >> enabled, ts(gt911) will fail to probe due to the failure of the i2c test.
> > >> 
> > >> But if we put the i2c0 node before i2c2, this issue does not occur.
> > >> 
> > >> The stable way to make sure that the pmic is initialized before other
> > >> peripheral devices is to make the pmic module a subsys call, and for
> > >> that the i2c module needs to be a subsys call first.
> > 
> > I do believe that came up in the past already and the direction from then
> > was (and most likely still is) that drivers should make use of the
> > probe-deferral mechanism instead of wiggling with the initcall ordering.
> 
> I don't think this is a good idea. This will trigger a lot of failed init
> calls. Before pmic init, all i2c device driver transfers will fail,
> and because i2c is a slow bus, and i2c transfers may fail for other
> reasons, the i2c driver and i2c device drivers will retry many times to
> make sure the transfer completes. These unnecessary transfers will
> make Linux boot very slowly.

In general, the slowdown won't be _this_ much if touchscreen drivers need 
one deferral-round before i2c is available. I'm also only pointing out 
things I remember from the last time this came up. 

rk3x-i2c even was here already:
http://www.spinics.net/lists/linux-i2c/msg16680.html


> The I2C bus should be subsys, and then we can easily resolve this problem.
> Why should we depend on a complicated and slow implementation?

because it's the only safe way to do that. Because now you need i2c-init at 
subsys-init time, some months later some other soc may need some other 
ordering, especially needing i2c-init later/earlier.

Going through the deferral mechanism is the only way currently available to 
actually make this work on all socs.

Tomeu from Collabora is working on some better scheme to optimize device 
probing order but it looks like this may be a bit off still.
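
As a rough illustration of the deferral argument above, here is a minimal userspace sketch, not kernel code. It models the rk818/gt911 case from the patch description: a device whose supply is not bound yet returns -EPROBE_DEFER and is retried later. The struct dev type and the probe()/probe_all() helpers are hypothetical names for this sketch; only the EPROBE_DEFER value (517) matches the kernel. The real driver core retries deferred devices when another driver binds rather than in fixed passes, so the pass count here is only an approximation.

```c
#include <stddef.h>

#define EPROBE_DEFER 517 /* same numeric value as the kernel's EPROBE_DEFER */

/* Hypothetical stand-in for a device with at most one supplier. */
struct dev {
    const char *name;
    struct dev *supply; /* device that must be bound first, or NULL */
    int bound;          /* 1 once probe succeeded */
    int attempts;       /* how often probe was called */
};

/* Probe one device: defer if its supply is not bound yet. */
int probe(struct dev *d)
{
    d->attempts++;
    if (d->supply && !d->supply->bound)
        return -EPROBE_DEFER;
    d->bound = 1;
    return 0;
}

/*
 * Probe all devices in the given (possibly "wrong") order and retry the
 * deferred ones until nothing is left or no progress is made.
 * Returns the number of passes that were needed.
 */
int probe_all(struct dev **devs, size_t n)
{
    int passes = 0;
    int progress = 1;
    size_t pending = n;

    while (pending > 0 && progress) {
        progress = 0;
        passes++;
        for (size_t i = 0; i < n; i++) {
            if (devs[i]->bound)
                continue;
            if (probe(devs[i]) == 0) {
                progress = 1;
                pending--;
            }
        }
    }
    return passes;
}
```

With the touchscreen listed before the pmic, the touchscreen is probed twice and everything binds in the second pass, which is the "one deferral-round" cost mentioned above.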


> > Your touchscreen will have a "xyz-supply" property and I think the
> > regulator-framework should already emit a -EPROBE_DEFER at regulator_get,
> > when the regulator is specified but not available yet.
> 
> Unfortunately, most drivers do not support the regulator api. They
> assume power is on.

Having touchscreen drivers support their proper supply-regulators is not 
rocket science ;-) [0], so I would consider this a bug in the touchscreen 
driver itself.

Just look into the datasheet and add the appropriate supplies to the drivers 
in question.


Heiko

[0] citing my own work: 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/input/touchscreen/zforce_ts.c#n806

> 
> Thanks.
> Huang, Tao
> 
> > Heiko
> > 
> >> Signed-off-by: Xu Jianqun 
> >> ---
> >> 
> >>  drivers/i2c/busses/i2c-rk3x.c | 12 +++-
> >>  1 file changed, 11 insertions(+), 1 deletion(-)
> >> 
> > >> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
> > >> index c1935eb..00e5959 100644
> >> --- a/drivers/i2c/busses/i2c-rk3x.c
> >> +++ b/drivers/i2c/busses/i2c-rk3x.c
> >> @@ -1037,7 +1037,17 @@ static struct platform_driver rk3x_i2c_driver =
> >> {
> >> 
> >>},
> >>  
> >>  };
> >> 
> >> -module_platform_driver(rk3x_i2c_driver);
> >> +static int __init rk3x_i2c_init(void)
> >> +{
> >> +  return platform_driver_register(&rk3x_i2c_driver);
> >> +}
> >> +subsys_initcall(rk3x_i2c_init);
> >> +
> >> +static void __exit rk3x_i2c_exit(void)
> >> +{
> >> +  platform_driver_unregister(&rk3x_i2c_driver);
> >> +}
> >> +module_exit(rk3x_i2c_exit);
> >> 
> >>  MODULE_DESCRIPTION("Rockchip RK3xxx I2C Bus driver");
> >>  MODULE_AUTHOR("Max Schwarz ");

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 06/32] s390: reuse asm-generic/barrier.h

2016-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2016 22:42:44 +0200
"Michael S. Tsirkin"  wrote:

> On Mon, Jan 04, 2016 at 04:03:39PM +0100, Martin Schwidefsky wrote:
> > On Mon, 4 Jan 2016 14:20:42 +0100
> > Peter Zijlstra  wrote:
> > 
> > > On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> > > > On s390 read_barrier_depends, smp_read_barrier_depends
> > > > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> > > > asm-generic variants exactly. Drop the local definitions and pull in
> > > > asm-generic/barrier.h instead.
> > > > 
> > > > This is in preparation to refactoring this code area.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > Acked-by: Arnd Bergmann 
> > > > ---
> > > >  arch/s390/include/asm/barrier.h | 10 ++
> > > >  1 file changed, 2 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> > > > index 7ffd0b1..c358c31 100644
> > > > --- a/arch/s390/include/asm/barrier.h
> > > > +++ b/arch/s390/include/asm/barrier.h
> > > > @@ -30,14 +30,6 @@
> > > >  #define smp_rmb()  rmb()
> > > >  #define smp_wmb()  wmb()
> > > >  
> > > > -#define read_barrier_depends() do { } while (0)
> > > > -#define smp_read_barrier_depends() do { } while (0)
> > > > -
> > > > -#define smp_mb__before_atomic()smp_mb()
> > > > -#define smp_mb__after_atomic() smp_mb()
> > > 
> > > As per:
> > > 
> > >   lkml.kernel.org/r/20150921112252.3c2937e1@mschwide
> > > 
> > > s390 should change this to barrier() instead of smp_mb() and hence
> > > should not use the generic versions.
> >  
> > Yes, we wanted to simplify this. Thanks for the reminder, I'll queue
> > a patch.
> 
> Could you base on my patchset maybe, to avoid conflicts,
> and I'll merge it?
> Or if it's just replacing these 2 with barrier() I can do this
> myself easily.

Probably the easiest solution is for you to do the patch yourself and
include it in your patch set. 
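
For reference, the change discussed above ("replacing these 2 with barrier()") could look roughly like the sketch below. This is an illustration, not the queued patch: barrier() is written in the form the kernel uses with GCC, while counter_inc() and the use of the __atomic_fetch_add compiler builtin are stand-ins added here so that the snippet builds in userspace.

```c
/* Compiler-only barrier: emits no instruction, only stops reordering. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* The two s390 definitions under discussion, relaxed from smp_mb(). */
#define smp_mb__before_atomic() barrier()
#define smp_mb__after_atomic()  barrier()

/*
 * Hypothetical caller: bracket an atomic increment with the two macros.
 * __atomic_fetch_add is a GCC/Clang builtin standing in for the kernel's
 * atomic operations; it returns the value before the addition.
 */
int counter_inc(int *counter)
{
    int old;

    smp_mb__before_atomic();
    old = __atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
    smp_mb__after_atomic();
    return old + 1;
}
```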

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [Propose] Isolate core_pattern in mnt namespace.

2016-01-05 Thread Eric W. Biederman
Dongsheng Yang  writes:

> On 12/24/2015 12:36 AM, Eric W. Biederman wrote:
>> Dongsheng Yang  writes:
> [...]
>
> Hi Eric,
>   Happy new year and sorry for the late reply.
>>
>> Given the other constraints on an implementation the pid namespace looks
>> by far the one best suited to host such a sysctl if it is possible to
>> implement safely.
>
> So you think it's better to isolate the core_pattern in pid_namespace,
> am I right?

Roughly.

> But, core_file_path and user_mode_helper_path in core_pattern are much
> more related with mnt_namespace IMO.
>
> Could you help to explain it more?

You need a full complement of namespaces, to execute a user mode helper.

Really roughly you need a namespaced equivalent of kthreadd, with a full
complement of namespaces and cgroups setup in the container.

Further it is necessary to have a clear rule that says which processes
that dump core are affected.  For a hierarchical pid namespace this is
straightforward.  For a mount namespace I don't know how that could be
implemented.

And yes the whole kthreadd thing that user mode helper does to launch a
task is necessary to have a clean and predictable environment.

Of course the default rule of dropping a file named core in the current
directory of the process that died works for everyone, with no kernel
modifications needed.

Eric


Re: [PATCH 5/8 v6] thermal: rcar: enable to use thermal-zone on DT

2016-01-05 Thread Kuninori Morimoto

Hi

ping ?

> From: Kuninori Morimoto 
> 
> This patch enables to use thermal-zone on DT if it was called as
> "renesas,rcar-gen2-thermal".
> Previous style (= non thermal-zone) is still supported by
> "renesas,rcar-thermal" to keep compatibility for "git bisect".
> 
> Signed-off-by: Kuninori Morimoto 
> ---
> v5 -> v6
> 
>  - "was call" -> "was called"
>  - add reason why it keeps previous style
> 
>  .../devicetree/bindings/thermal/rcar-thermal.txt   | 37 +-
>  drivers/thermal/rcar_thermal.c | 45 +++---
>  2 files changed, 75 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> index 332e625..e5ee3f1 100644
> --- a/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> +++ b/Documentation/devicetree/bindings/thermal/rcar-thermal.txt
> @@ -1,8 +1,9 @@
>  * Renesas R-Car Thermal
>  
>  Required properties:
> -- compatible : "renesas,thermal-", "renesas,rcar-thermal"
> -   as fallback.
> +- compatible : "renesas,thermal-",
> +"renesas,rcar-gen2-thermal" (with thermal-zone) or
> +"renesas,rcar-thermal" (without thermal-zone) as 
> fallback.
> Examples with soctypes are:
>   - "renesas,thermal-r8a73a4" (R-Mobile APE6)
>   - "renesas,thermal-r8a7779" (R-Car H1)
> @@ -36,3 +37,35 @@ thermal@e61f {
>   0xe61f0300 0x38>;
>   interrupts = <0 69 IRQ_TYPE_LEVEL_HIGH>;
>  };
> +
> +Example (with thermal-zone):
> +
> +thermal-zones {
> + cpu_thermal: cpu-thermal {
> + polling-delay-passive   = <1000>;
> + polling-delay   = <5000>;
> +
> + thermal-sensors = <&thermal>;
> +
> + trips {
> + cpu-crit {
> + temperature = <115000>;
> + hysteresis  = <0>;
> + type= "critical";
> + };
> + };
> + cooling-maps {
> + };
> + };
> +};
> +
> +thermal: thermal@e61f {
> + compatible ="renesas,thermal-r8a7790",
> + "renesas,rcar-gen2-thermal",
> + "renesas,rcar-thermal";
> + reg = <0 0xe61f 0 0x14>, <0 0xe61f0100 0 0x38>;
> + interrupts = <0 69 IRQ_TYPE_LEVEL_HIGH>;
> + clocks = <&mstp5_clks R8A7790_CLK_THERMAL>;
> + power-domains = <&cpg_clocks>;
> + #thermal-sensor-cells = <0>;
> +};
> diff --git a/drivers/thermal/rcar_thermal.c b/drivers/thermal/rcar_thermal.c
> index 30602f2..e92f29b 100644
> --- a/drivers/thermal/rcar_thermal.c
> +++ b/drivers/thermal/rcar_thermal.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -75,8 +76,10 @@ struct rcar_thermal_priv {
>  #define rcar_has_irq_support(priv)   ((priv)->common->base)
>  #define rcar_id_to_shift(priv)   ((priv)->id * 8)
>  
> +#define USE_OF_THERMAL   1
>  static const struct of_device_id rcar_thermal_dt_ids[] = {
>   { .compatible = "renesas,rcar-thermal", },
> + { .compatible = "renesas,rcar-gen2-thermal", .data = (void *)USE_OF_THERMAL },
>   {},
>  };
>  MODULE_DEVICE_TABLE(of, rcar_thermal_dt_ids);
> @@ -200,9 +203,9 @@ err_out_unlock:
>   return ret;
>  }
>  
> -static int rcar_thermal_get_temp(struct thermal_zone_device *zone, int *temp)
> +static int rcar_thermal_get_current_temp(struct rcar_thermal_priv *priv,
> +  int *temp)
>  {
> - struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
>   int tmp;
>   int ret;
>  
> @@ -226,6 +229,20 @@ static int rcar_thermal_get_temp(struct thermal_zone_device *zone, int *temp)
>   return 0;
>  }
>  
> +static int rcar_thermal_of_get_temp(void *data, int *temp)
> +{
> + struct rcar_thermal_priv *priv = data;
> +
> + return rcar_thermal_get_current_temp(priv, temp);
> +}
> +
> +static int rcar_thermal_get_temp(struct thermal_zone_device *zone, int *temp)
> +{
> + struct rcar_thermal_priv *priv = rcar_zone_to_priv(zone);
> +
> + return rcar_thermal_get_current_temp(priv, temp);
> +}
> +
>  static int rcar_thermal_get_trip_type(struct thermal_zone_device *zone,
> int trip, enum thermal_trip_type *type)
>  {
> @@ -282,6 +299,10 @@ static int rcar_thermal_notify(struct thermal_zone_device *zone,
>   return 0;
>  }
>  
> +static const struct thermal_zone_of_device_ops rcar_thermal_zone_of_ops = {
> + .get_temp   = rcar_thermal_of_get_temp,
> +};
> +
>  static struct thermal_zone_device_ops rcar_thermal_zone_ops = {
>   .get_temp   = rcar_thermal_get_temp,
>   .get_trip_type  = rcar_thermal_get_trip_type,
> @@ -318,14 +339,20 @@ static voi

Re: arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'

2016-01-05 Thread Guenter Roeck

On 01/04/2016 11:23 PM, kbuild test robot wrote:

> Hi Guenter,
>
> First bad commit (maybe != root cause):
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   168309855a7d1e16db751e9c647119fe2d2dc878
> commit: 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e MIPS: VDSO: Fix build error with binutils 2.24 and earlier
> date:   6 days ago
> config: mips-jmr3927_defconfig (attached as .config)
> reproduce:
>         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         git checkout 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e
>         # save the attached .config to linux build tree
>         make.cross ARCH=mips
>
> All errors (new ones prefixed by >>):
>
> >> arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'
>     /*
>     ^

AFAICS this is using the mips cross compiler version 4.9.0 from kernel.org [1],
which in turn uses binutils 2.24. At least this is what make.cross tries to
install.

I downloaded it and used it to compile both the attached configuration as well
as jmr3927_defconfig. Both are building just fine for me. On top of that,
arch/mips/vdso/gettimeofday.c should not build in the first place with binutils
2.24 (and doesn't build for me).

What am I missing?

Guenter

---
[1] https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/x86_64-gcc-4.9.0-nolibc_mips-linux.tar.xz



Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2016 22:18:58 +0200
"Michael S. Tsirkin"  wrote:

> On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > This defines __smp_xxx barriers for s390,
> > > for use by virtualization.
> > > 
> > > Some smp_xxx barriers are removed as they are
> > > defined correctly by asm-generic/barriers.h
> > > 
> > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > unconditionally on this architecture.
> > > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > Acked-by: Arnd Bergmann 
> > > ---
> > >  arch/s390/include/asm/barrier.h | 15 +--
> > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > 
> > > > diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> > > index c358c31..fbd25b2 100644
> > > --- a/arch/s390/include/asm/barrier.h
> > > +++ b/arch/s390/include/asm/barrier.h
> > > @@ -26,18 +26,21 @@
> > >  #define wmb()barrier()
> > >  #define dma_rmb()mb()
> > >  #define dma_wmb()mb()
> > > -#define smp_mb() mb()
> > > -#define smp_rmb()rmb()
> > > -#define smp_wmb()wmb()
> > > -
> > > > -#define smp_store_release(p, v)  \
> > > +#define __smp_mb()   mb()
> > > +#define __smp_rmb()  rmb()
> > > +#define __smp_wmb()  wmb()
> > > +#define smp_mb() __smp_mb()
> > > +#define smp_rmb()__smp_rmb()
> > > +#define smp_wmb()__smp_wmb()
> > 
> > Why define the smp_*mb() primitives here? Would not the inclusion of
> > asm-generic/barrier.h do this?
> 
> No because the generic one is a nop on !SMP, this one isn't.
> 
> Pls note this patch is just reordering code without making
> functional changes.
> And at the moment, on s390 smp_xxx barriers are always non empty.

The s390 kernel is SMP to 99.99%, we just didn't bother with a
non-smp variant for the memory-barriers. If the generic header
is used we'd get the non-smp version for free. It will save a
small amount of text space for CONFIG_SMP=n. 
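
The point about getting the non-SMP variant "for free" can be sketched as follows. This is a simplified userspace model of how a generic header could choose between the two variants, not a copy of asm-generic/barrier.h. The __sync_synchronize() builtin stands in for the architecture's full barrier, and publish() is a hypothetical caller added for illustration.

```c
/* Compiler-only barrier: no instruction is emitted. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Stand-in for the arch-provided full barrier (mb() on s390). */
#define __smp_mb() __sync_synchronize()

/*
 * Generic selection: with CONFIG_SMP the real barrier is used, without it
 * smp_mb() shrinks to a compiler barrier, which is where the small text
 * saving for CONFIG_SMP=n would come from.
 */
#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif

/* Hypothetical caller: store the data, then the flag, in that order. */
int publish(int *data, int *ready, int value)
{
    *data = value;
    smp_mb();
    *ready = 1;
    return *data;
}
```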
 
> Some of this could be sub-optimal, but
> since on s390 Linux always runs on a hypervisor,
> I am not sure it's safe to use the generic version -
> in other words, it just might be that for s390 smp_ and virt_
> barriers must be equivalent.

The definition of the memory barriers is independent of whether the
system is running on a hypervisor or not. Is there really
an architecture where you need special virt_xxx barriers?!? 

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: thinkpad_acpi: BUG: unable to handle kernel NULL pointer dereference

2016-01-05 Thread Pali Rohár
Hello,

looks like this fault is in acpi video module, not in thinkpad. CCing
Hans who introduced this acpi video brightness key change.

Hans, can you look at this bug?

On Monday 04 January 2016 18:33:47 Jeremiah Mahler wrote:
> all,
> 
> Just tried linux-next 20160104 on a Lenovo Carbon X1 and I got a BUG
> message about a NULL pointer dereference.  There is also a WARNING about
> a mutex (see below).  It looks like it might be related to something in
> the thinkpad_acpi module.
> 
> [...]
> [2.374627] thinkpad_acpi: ThinkPad ACPI Extras v0.25
> [2.374630] thinkpad_acpi: http://ibm-acpi.sf.net/
> [2.374632] thinkpad_acpi: ThinkPad BIOS G6ET59WW (2.03 ), EC unknown
> [2.374633] thinkpad_acpi: Lenovo ThinkPad X1 Carbon, model 3443CTO
> [2.375176] thinkpad_acpi: Unsupported brightness interface
> [2.375303] thinkpad_acpi: radio switch found; radios are enabled
> [2.375317] [ cut here ]
> [2.375384] WARNING: CPU: 1 PID: 303 at kernel/locking/mutex.c:526 
> __mutex_lock_slowpath+0x2d8/0x2f0()
> [2.375453] DEBUG_LOCKS_WARN_ON(l->magic != l)
> [2.375503] Modules linked in:
> [2.375592]  battery thinkpad_acpi(+) mei_me snd_timer nvram ac snd mei 
> video button tpm_tis(+) i2c_i801 shpchp soundcore lpc_ich tpm mfd_core 
> i2c_core intel_smartconnect btusb btbcm btintel bluetooth rfkill loop ipv6 
> autofs4 ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata scsi_mod sdhci_pci 
> sdhci xhci_pci mmc_core xhci_hcd ehci_pci ehci_hcd usbcore usb_common thermal
> [2.377790] CPU: 1 PID: 303 Comm: systemd-udevd Not tainted 
> 4.4.0-rc7-next-20160104+ #2
> [2.377888] Hardware name: LENOVO 3443CTO/3443CTO, BIOS G6ET59WW (2.03 ) 
> 09/11/2012
> [2.377984]  81722534 812d0967 880035e07ae0 
> 8106ad7d
> [2.378269]  a0381360 880035e07b30 a0381368 
> 8801181ef100
> [2.378550]  a03ea348 8106adfc 8172251c 
> 0020
> [2.378817] Call Trace:
> [2.378895]  [] ? dump_stack+0x40/0x59
> [2.378971]  [] ? warn_slowpath_common+0x7d/0xb0
> [2.379033]  [] ? warn_slowpath_fmt+0x4c/0x50
> [2.379109]  [] ? kernfs_add_one+0x103/0x160
> [2.379185]  [] ? __mutex_lock_slowpath+0x2d8/0x2f0
> [2.379262]  [] ? mutex_lock+0x16/0x30
> [2.379343]  [] ? 
> acpi_video_handles_brightness_key_presses+0x12/0x40 [video]
> [2.379427]  [] ? hotkey_init+0x5aa/0x716 [thinkpad_acpi]
> [2.379502]  [] ? 
> thinkpad_acpi_module_init.part.32+0x5f6/0x925 [thinkpad_acpi]
> [2.379598]  [] ? 
> thinkpad_acpi_module_init.part.32+0x925/0x925 [thinkpad_acpi]
> [2.379693]  [] ? thinkpad_acpi_module_init+0x352/0x8cf 
> [thinkpad_acpi]
> [2.379799]  [] ? free_pcppages_bulk+0xbb/0x480
> [2.379871]  [] ? do_one_initcall+0xb2/0x200
> [2.379947]  [] ? do_init_module+0x5b/0x1e0
> [2.380012]  [] ? load_module+0x220e/0x2810
> [2.380079]  [] ? __symbol_put+0x30/0x30
> [2.380147]  [] ? SyS_finit_module+0x90/0xc0
> [2.380219]  [] ? entry_SYSCALL_64_fastpath+0x16/0x71
> [2.380296] ---[ end trace 60661306144c0866 ]---
> [2.380374] BUG: unable to handle kernel NULL pointer dereference at   
> (null)
> [2.380552] IP: [] __mutex_lock_slowpath+0xd5/0x2f0
> [2.380682] PGD 0 
> [2.380795] Oops: 0002 [#1] SMP 
> [2.380968] Modules linked in: battery thinkpad_acpi(+) mei_me snd_timer 
> nvram ac snd mei video button tpm_tis(+) i2c_i801 shpchp soundcore lpc_ich 
> tpm mfd_core i2c_core intel_smartconnect btusb btbcm btintel bluetooth rfkill 
> loop ipv6 autofs4 ext4 crc16 mbcache jbd2 sd_mod ahci libahci libata scsi_mod 
> sdhci_pci sdhci xhci_pci mmc_core xhci_hcd ehci_pci ehci_hcd usbcore 
> usb_common thermal
> [2.383278] CPU: 1 PID: 303 Comm: systemd-udevd Tainted: GW   
> 4.4.0-rc7-next-20160104+ #2
> [2.383358] Hardware name: LENOVO 3443CTO/3443CTO, BIOS G6ET59WW (2.03 ) 
> 09/11/2012
> [2.383455] task: 8801181ef100 ti: 880035e04000 task.ti: 
> 880035e04000
> [2.383547] RIP: 0010:[]  [] 
> __mutex_lock_slowpath+0xd5/0x2f0
> [2.383634] ACPI: Battery Slot [BAT0] (battery present)
> [2.383706] RSP: 0018:880035e07b40  EFLAGS: 00010002
> [2.383761] RAX:  RBX: a0381360 RCX: 
> a0381380
> [2.383820] RDX:  RSI: 880035e07b50 RDI: 
> a0381360
> [2.383874] RBP: 880035e07ba0 R08:  R09: 
> 02c4
> [2.383924] R10: 81a85600 R11: 02c4 R12: 
> a0381368
> [2.383975] R13: 8801181ef100 R14:  R15: 
> 0246
> [2.384026] FS:  7f1c1743a8c0() GS:88011e28() 
> knlGS:
> [2.384087] CS:  0010 DS:  ES:  CR0: 80050033
> [2.384136] CR2:  CR3: 35fa6000 CR4: 
> 001406e0
> [2.384186] Stack:
> [2.384230]  8800c76612f8 a0

Re: arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'

2016-01-05 Thread Fengguang Wu
On Tue, Jan 05, 2016 at 12:09:14AM -0800, Guenter Roeck wrote:
> On 01/04/2016 11:23 PM, kbuild test robot wrote:
> >Hi Guenter,
> >
> >First bad commit (maybe != root cause):
> >
> >tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >head:   168309855a7d1e16db751e9c647119fe2d2dc878
> >commit: 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e MIPS: VDSO: Fix build error with binutils 2.24 and earlier
> >date:   6 days ago
> >config: mips-jmr3927_defconfig (attached as .config)
> >reproduce:
> > wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > git checkout 398c7500a1f5f74e207bd2edca1b1721b3cc1f1e
> > # save the attached .config to linux build tree
> > make.cross ARCH=mips
> >
> >All errors (new ones prefixed by >>):
> >
> >>>arch/mips/vdso/gettimeofday.c:1:0: error: '-march=r3900' requires '-mfp32'
> > /*
> > ^
> >
> AFAICS this is using the mips cross compiler version 4.9.0 from kernel.org [1],
> which in turn uses binutils 2.24. At least this is what make.cross tries to
> install.

Oops, sorry. I'm now using the debian MIPS cross compiler 5.2.1 ...
make.cross has not been updated yet.

> I downloaded it and used it to compile both the attached configuration as well
> as jmr3927_defconfig. Both are building just fine for me. On top of that,
> arch/mips/vdso/gettimeofday.c should not build in the first place with binutils
> 2.24 (and doesn't build for me).
> 
> What am I missing?
> 
> Guenter
> 
> ---
> [1] https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.9.0/x86_64-gcc-4.9.0-nolibc_mips-linux.tar.xz


Reading the same block via partition and non-partitioned device gives different content

2016-01-05 Thread Andrei Borzenkov
[Please Cc me on reply; thank you]

QEMU KVM virtual machine with openSUSE Tumbleweed (kernel
4.3.3-3-default); MD RAID1 with 1.2 metadata on /dev/vdb1 and /dev/vdc1.

If I do

mdadm /dev/mdX --fail /dev/vdb1
mdadm /dev/mdX --add /dev/vdd1

and wait for synchronization to finish and then look directly at the on-disk
superblock, I see different content depending on whether I read from /dev/vdd
or from /dev/vdd1.

/dev/vdd1 claims the new disk is still a spare (i.e. apparently the state
immediately after adding it).

The *same* superblock when read from /dev/vdd (of course with
appropriate offset) correctly marks /dev/vdd1 as RAID member. The same
content is also seen when looking from the host.
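
For anyone who wants to repeat the comparison, the "appropriate offset" can be computed as in the sketch below. It assumes 512-byte sectors and relies on the 1.2 metadata format placing the superblock 4 KiB after the start of the member device; the function name and the example start sector are made up for illustration.

```c
#include <stdint.h>

#define SECTOR_SIZE      512u  /* assumed logical sector size */
#define MD_V12_SB_OFFSET 4096u /* 1.2 superblock: 4 KiB into the member device */

/*
 * Byte offset of the md 1.2 superblock of a partition when the whole disk
 * (e.g. /dev/vdd) is read instead of the partition (e.g. /dev/vdd1).
 * part_start_sector is the partition's first sector on the disk.
 */
uint64_t md12_sb_offset_on_disk(uint64_t part_start_sector)
{
    return part_start_sector * SECTOR_SIZE + MD_V12_SB_OFFSET;
}
```

For a partition starting at sector 2048 this gives 1052672, while the same superblock sits at offset 4096 when read through the partition device.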

I hit this when debugging a problem with grub2, which scans devices for
Linux MD (it is using the same code both at boot and run time). It
skipped the replacement disk because it believed the disk was a spare.

Is this expected behavior? Note that if we now replace /dev/vdc1 with
something else we have the "wrong" superblock on both partitions, so grub
fails to find the array completely. Fortunately this is a transient state, but
it also means it is impossible to reconfigure grub until reboot.
Alternatively, shutting down and restarting the array clears it as well.

Any trick we can use to force content to be the same in this state?

-andrei


Re: rsi: Delete unnecessary variable initialisations in rsi_send_mgmt_pkt()

2016-01-05 Thread SF Markus Elfring
> That said, if you figure out some change that produces significant
> reductions in code or binary size on multiple architectures without
> making things more complicated, less readable or making the code or
> binary size larger, then by all means propose it.

Are you looking also for "a proof" that such changes are worthwhile?


> "This makes things smaller" carries much more weight than
> "I think this is better".

Can the discussed implementation of a function like "rsi_send_mgmt_pkt"
become a bit smaller by the deletion of extra variable initialisations?


> Almost all of the changes you've proposed that have seen any
> discussion whatsoever fall into the latter category.

Thanks for your interesting feedback.

Can a further constructive dialogue evolve from the presented information?

Regards,
Markus


Re: [RESEND PATCH] rtc: rk808: rename rtc-rk808.c to rtc-rk8xx.c

2016-01-05 Thread Huang, Tao
Hi, Alessandro:

On 2016-01-04 21:59, Alessandro Zummo wrote:
> On Mon, 4 Jan 2016 10:45:46 +0100
> Alexandre Belloni  wrote:
> 
>> I'm not sure it is useful to do that renaming. It is usual to have one
>> driver that supports multiple chips named with the first chip it
>> supported.
>>
>> Also, what would happen if for example rk855 is not compatible at all
>> with the previous implementations?
> 
>  Alexandre is absolutely right. There's no need to rename a driver,
>  it would just piss off people who are used to that name and
>  have it in their scripts. Like when your eth0 gets renamed
>  to some obscure enXXX .
> 

You and Alexandre are right. The rename is just meant to make the driver more
readable, i.e. to let people know this driver suits more PMICs, not just
rk808. In fact, I don't care whether the name is rk808 or rk8xx.

The key change of this patch is to decouple the rk808 driver and the RTC
driver. Because register offsets and functions vary between
different PMICs, we believe it is hard to write one PMIC driver to suit
all PMICs. So we hope the RTC driver can be shared between all PMICs from
Rockchip.

Please review this code:

-static int rk808_rtc_probe(struct platform_device *pdev)
+static int rk8xx_rtc_probe(struct platform_device *pdev)
 {
-   struct rk808 *rk808 = dev_get_drvdata(pdev->dev.parent);
...
+   struct i2c_client *client = to_i2c_client(pdev->dev.parent);

...

-   rk808_rtc->rk808 = rk808;
+   rk8xx_rtc->regmap = devm_regmap_init_i2c(client,
+&rk8xx_rtc_regmap_config);
...
+   rk8xx_rtc->i2c = client;

The old driver has a struct rk808 pointer, which is defined in
include/linux/mfd/rk808.h.
If we write a new PMIC driver, for example rk818, and define a new struct
rk818, how can we get this pointer from the RTC driver?

So another way to solve this problem is to introduce a common struct shared
between all PMIC drivers, for example rk8xx.

We solve this problem by creating a new regmap to access the PMIC. As I said
before, it makes the RTC driver independent of the PMIC driver. Do you agree
with this change?

Thanks!
Huang, Tao



Re: [PATCH] base/platform: Fix platform drivers with no probe callback (ex alarmtimer)

2016-01-05 Thread Uwe Kleine-König
Hello,

I think this is the same problem that another Martin found and fixed in

http://mid.gmane.org/1449132704-9952-1-git-send-email-martin.wi...@ts.fujitsu.com

I didn't check, but thought Greg already picked that up?!

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |


Re: [PATCH] i2c: rk3x: init module as subsys call

2016-01-05 Thread Huang, Tao
Hi, Heiko:

On 2016-01-05 16:00, Heiko Stuebner wrote:
> Hi Tao,
> 
> On Tuesday, 5 January 2016, 15:42:32, Huang, Tao wrote:

>> I don't think this is a good idea. This will trigger a lot of failed init
>> calls. Before pmic init, all i2c device driver transfers will fail,
>> and because i2c is a slow bus, and i2c transfers may fail for other
>> reasons, the i2c driver and i2c device drivers will retry many times to
>> make sure the transfer completes. These unnecessary transfers will
>> make Linux boot very slowly.
> 
> In general, the slowdown won't be _this_ much if touchscreen drivers need 
> one deferral-round before i2c is available. I'm also only pointing out 
> things I remember from the last time this came up. 
> 
> rk3x-i2c even was here already:
> http://www.spinics.net/lists/linux-i2c/msg16680.html

OK. I don't agree with the rule, but we will follow it.

> 
> 
>> The I2C bus should be subsys, and then we can easily resolve this problem.
>> Why should we depend on a complicated and slow implementation?
> 
> because it's the only safe way to do that. Because now you need i2c-init at 
> subsys-init time, some months later some other soc may need some other 
> ordering, especially needing i2c-init later/earlier.
> 
> Going through the deferral mechanism is the only way currently available to 
> actually make this work on all socs.
> 
> Tomeu from Collabora is working on some better scheme to optimize device 
> probing order but it looks like this may be a bit off still.
> 
> 
>>> Your touchscreen will have a "xyz-supply" property and I think the
>>> regulator-framework should already emit a -EPROBE_DEFER at regulator_get,
>>> when the regulator is specified but not available yet.
>>
>> Unfortunately, most drivers do not support the regulator api. They
>> assume power is on.
> 
> Having touchscreen drivers support their proper supply-regulators is not 
> rocket science ;-) [0], so I would consider this a bug in the touchscreen 
> driver itself.

I'm not just talking about touchscreen drivers; most i2c device drivers, such
as input sensor/camera/rtc/battery drivers, will suffer. So people will see
their drivers fail to work or slow down on the rk3368 platform :(
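
The deferral behaviour discussed in this thread can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: the names toy_dev, toy_probe and toy_probe_all are hypothetical and are not kernel API, and the real driver core keeps a deferred-probe list rather than rescanning an array. It shows that with the touchscreen listed before its PMIC, one extra pass is enough, which is the "one deferral-round" Heiko mentions.

```c
/* Toy userspace model of probe deferral; none of this is kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPROBE_DEFER 517	/* mirrors the kernel's EPROBE_DEFER value */

struct toy_dev {
	const char *name;
	int supply;	/* index of the device providing power, -1 if none */
	bool probed;
};

/* Probe one device; defer if its supply has not been probed yet. */
static int toy_probe(struct toy_dev *devs, size_t i)
{
	int s = devs[i].supply;

	if (s >= 0 && !devs[s].probed)
		return -TOY_EPROBE_DEFER;
	devs[i].probed = true;
	return 0;
}

/*
 * Walk the list in order and retry deferred devices until everything is
 * probed or a pass makes no progress. Returns the number of passes.
 */
static int toy_probe_all(struct toy_dev *devs, size_t n)
{
	int passes = 0;
	bool pending = true, progress = true;

	assert(devs != NULL);
	while (pending && progress) {
		pending = false;
		progress = false;
		passes++;
		for (size_t i = 0; i < n; i++) {
			if (devs[i].probed)
				continue;
			if (toy_probe(devs, i) == 0)
				progress = true;
			else
				pending = true;
		}
	}
	return passes;
}
```

Calling toy_probe_all() on { gt911 (supply: rk818), rk818 } takes two passes; with the order reversed it takes one. The model says nothing about how long a failed probe attempt takes on real hardware, which is the cost Tao is worried about.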

Thanks!
Huang, Tao

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Add possibility to set /dev/tty number

2016-01-05 Thread Pierre Paul MINGOT
2016-01-04 19:41 GMT+01:00 Austin S. Hemmelgarn :
> On 2016-01-04 12:11, Greg KH wrote:
>>
>> On Mon, Jan 04, 2016 at 11:57:33AM -0500, Austin S. Hemmelgarn wrote:
>>>
>>> On 2016-01-04 10:43, Greg KH wrote:

>>>> On Mon, Jan 04, 2016 at 04:34:56PM +0100, Pierre Paul MINGOT wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> In Linux there is no way to set the number of tty devices or consoles
>>>>> to create. By default the kernel creates 64 /dev/tty devices, which is
>>>>> too much for embedded systems with limited resources.
>>>>
>>>>
>>>> Really?  How much memory does a vt device take up?
>>>
>>> On a device with a simple text mode console in 80x25, a minimum of 2000
>>> bytes, not including anything used for character attributes, and anything
>>> else needed for the display and updating of the screen (I think I worked
>>> out
>>> once that it comes out to about 8k).  On my laptop which has a 1920x1080
>>> screen, using the standard 8x16 VGA font with a framebuffer console via
>>> i915, I get a 200x67 terminal size, which means that just the text
>>> without
>>> any attributes works out to a little more than 13k.  That gets doubled
>>> just
>>> by adding color, and probably doubled again for the other display
>>> attributes.  All of this also doesn't factor in the space taken up in
>>> devtmpfs and sysfs by the associated files (it's not much, but it's still
>>> wasted space).
>>
>>
>> If the console isn't initialized by userspace, is any of that space
>> still really being used?  Have you tried that?
>
> I'm pretty certain that most of the space that gets taken up by the
> scrollback buffer and screen isn't directly used unless the console is used,
> but there are still structures that get allocated at driver instantiation
> for each VT, including the device structures and such.
>>
>>
>>> That said, there are factors to consider other than just memory
>>> footprint:
>>> 1. Having 64 tty devices in /dev leads to somewhat cluttered listings (on
>>> most small systems I see, more than two thirds of the contents of /dev
>>> are
>>> tty device nodes).
>>
>>
>> Not having a cluttered /dev isn't the best reasoning here :)
>
> It wasn't intended as an argument on its own, simply an additional point.
> It does have an impact though if you're dealing with something like a slow
> serial console, and it also looks _really_ odd having a bunch of device
> nodes for virtual devices that aren't used, and on most systems you can't
> get rid of at runtime (I've always been under the impression that having a
> dynamic /dev was primarily to avoid all the clutter you see there on systems
> like BSD (most derivatives of which still use a statically initialized
> /dev)).
>>
>>
>>> 2. Most people don't know how to switch to anything higher than about tty
>>> 15, a majority of people who have a graphical environment use at most 2
>>> VT's, and a lot of embedded systems use a fixed number of VT's that is
>>> known
>>> prior to full production.
>>
>>
>> Agreed, but does this actually take up memory?
>
> My point here was more that high numbered VT's are something that's pretty
> much unused on most systems, and therefore there is almost zero benefit for
> a majority of people.  At the very least it takes up space for the driver
> internal structures, and the stuff in sysfs.  While a few Kb of memory may
> not seem like much given that servers with close to 1Tb of RAM are starting
> to become common, it can still make a lot of difference in performance for a
> small embedded system.
>>
>>
>>> 3. There is some very poorly designed software out there (at least the
>>> original version of ConsoleKit, and I'd be willing to bet some
>>> third-party
>>> vendor software) which unconditionally starts a thread or process for
>>> each
>>> VT in the system.  While this software should be fixed to behave
>>> properly,
>>> it's infeasible for most end users to do this.
>>
>>
>> If we remove the number of devices, those "broken" userspace programs
>> will also break, so that implies that we should not allow this change.
>
> No, the software should just need to be recompiled (I've tested this with
> ConsoleKit, which also fails gracefully when it tries to open a tty device
> that doesn't exist), or adapted to dynamically detect the number of TTYs
> (like it should have in the first place for portability reasons).
>>
>>
>> Please provide some "real" numbers of memory savings please before
>> saying that this change really does save memory.  Just guessing isn't
>> ok.
>
> I can probably put something together to actually test this, but it will
> take a while (most of my testing scripts and VM's are targeted at regression
> testing of filesystems, not memory profiling of virtual device drivers). I
> doubt that it will work out to any more than 16k size difference, but that's
> still a few more pages (on most systems) that could be used for other
> things.
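
The screen-buffer estimates quoted earlier in this message (2000 bytes for 80x25, a little more than 13k for 200x67, doubling per attribute layer) are plain cell-count arithmetic. A minimal sketch, taking the terminal sizes exactly as stated in the thread; toy_screen_bytes and its bytes_per_cell parameter are hypothetical and do not describe how the vt driver actually lays out its buffers:

```c
#include <assert.h>
#include <stddef.h>

/* Bytes for one screen of text: one cell per character position. */
static size_t toy_screen_bytes(size_t cols, size_t rows, size_t bytes_per_cell)
{
	assert(bytes_per_cell > 0);
	return cols * rows * bytes_per_cell;
}
```

With one byte per cell, toy_screen_bytes(80, 25, 1) gives 2000 and toy_screen_bytes(200, 67, 1) gives 13400; passing 2 or 4 bytes per cell models the "doubled" and "doubled again" cases. These are per-VT figures for an in-use console and are not the measured numbers Greg asks for.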


I totally agree with the points evoked by Austin. Nevertheless, the
interests of this patch are not  ON

Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote:
> Hi Michael,
> 
> On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote:
> > This defines __smp_xxx barriers for powerpc
> > for use by virtualization.
> > 
> > smp_xxx barriers are removed as they are
> > defined correctly by asm-generic/barrier.h

I think this is the part that was missed in review.

> > This reduces the amount of arch-specific boiler-plate code.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > Acked-by: Arnd Bergmann 
> > ---
> >  arch/powerpc/include/asm/barrier.h | 24 
> >  1 file changed, 8 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/barrier.h 
> > b/arch/powerpc/include/asm/barrier.h
> > index 980ad0c..c0deafc 100644
> > --- a/arch/powerpc/include/asm/barrier.h
> > +++ b/arch/powerpc/include/asm/barrier.h
> > @@ -44,19 +44,11 @@
> >  #define dma_rmb()  __lwsync()
> >  #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
> >  
> > -#ifdef CONFIG_SMP
> > -#define smp_lwsync()   __lwsync()
> > +#define __smp_lwsync() __lwsync()
> >  
> 
> so __smp_lwsync() is always mapped to lwsync, right?

Yes.

> > -#define smp_mb()   mb()
> > -#define smp_rmb()  __lwsync()
> > -#define smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
> > -#else
> > -#define smp_lwsync()   barrier()
> > -
> > -#define smp_mb()   barrier()
> > -#define smp_rmb()  barrier()
> > -#define smp_wmb()  barrier()
> > -#endif /* CONFIG_SMP */
> > +#define __smp_mb()   mb()
> > +#define __smp_rmb()  __lwsync()
> > +#define __smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
> >  
> >  /*
> >   * This is a barrier which prevents following instructions from being
> > @@ -67,18 +59,18 @@
> >  #define data_barrier(x)\
> > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> >  
> > -#define smp_store_release(p, v)                                     \
> > +#define __smp_store_release(p, v)                                   \
> >  do {                                                                \
> > compiletime_assert_atomic_type(*p); \
> > -   smp_lwsync();   \
> > +   __smp_lwsync(); \
> 
> , therefore this will emit an lwsync no matter SMP or UP.

Absolutely. But smp_store_release (without __) will not.

Please note I did test this: for ppc code before and after
this patch generates exactly the same binary on SMP and UP.
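
The layering described here can be sketched in userspace. This is an illustration only, assuming (as I understand the series) that the generic header maps the unprefixed name to the __-prefixed one on SMP and to a compiler barrier on UP. All toy_* names are hypothetical stand-ins, and the counter replaces the real lwsync instruction so the effect is observable:

```c
#include <assert.h>

static int toy_hw_barriers;	/* counts stand-in "lwsync" executions */

/* Stand-ins for __lwsync() and barrier(); not the real powerpc definitions. */
#define toy_lwsync()		((void)toy_hw_barriers++)
#define toy_barrier()		__asm__ __volatile__("" : : : "memory")

/* Always the hardware barrier, like __smp_lwsync() in the patch. */
#define toy_raw_smp_lwsync()	toy_lwsync()

/* Unprefixed variant: hardware barrier on SMP, compiler barrier on UP. */
#ifdef TOY_CONFIG_SMP
#define toy_smp_lwsync()	toy_raw_smp_lwsync()
#else
#define toy_smp_lwsync()	toy_barrier()
#endif

/* Models __smp_store_release(): barrier is emitted regardless of SMP/UP. */
static void toy_raw_store_release(volatile int *p, int v)
{
	toy_raw_smp_lwsync();
	*p = v;
}

/* Models smp_store_release(): only a compiler barrier on a UP build. */
static void toy_store_release(volatile int *p, int v)
{
	toy_smp_lwsync();
	*p = v;
}
```

Built without TOY_CONFIG_SMP, toy_store_release() leaves the counter untouched while toy_raw_store_release() increments it, which mirrors the SMP/UP distinction drawn above.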


> Another thing is that smp_lwsync() may have a third user(other than
> smp_load_acquire() and smp_store_release()):
> 
> http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> 
> I'm OK to change my patch accordingly, but do we really want
> smp_lwsync() get involved in this cleanup? If I understand you
> correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> while smp_lwsync() is internal to PPC.
> 
> Regards,
> Boqun

I think you missed the leading __ :)

smp_store_release is external and it needs __smp_lwsync as
defined here.

I can duplicate some code and have smp_lwsync *not* call __smp_lwsync
but why do this? Still, if you prefer it this way,
please let me know.

> > WRITE_ONCE(*p, v);  \
> >  } while (0)
> >  
> > -#define smp_load_acquire(p)                                         \
> > +#define __smp_load_acquire(p)                                       \
> >  ({ \
> > typeof(*p) ___p1 = READ_ONCE(*p);   \
> > compiletime_assert_atomic_type(*p); \
> > -   smp_lwsync();   \
> > +   __smp_lwsync(); \
> > ___p1;  \
> >  })
> >  
> > -- 
> > MST
> > 


[PATCH 1/2] f2fs: check node id earily when readahead NAT page

2016-01-05 Thread Chao Yu
Add node id check in ra_node_page and get_node_page_ra like get_node_page.

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 6d5f548..c1ddf3d 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1041,6 +1041,10 @@ void ra_node_page(struct f2fs_sb_info *sbi, nid_t nid)
struct page *apage;
int err;
 
+   if (!nid)
+   return;
+   f2fs_bug_on(sbi, check_nid_range(sbi, nid));
+
apage = find_get_page(NODE_MAPPING(sbi), nid);
if (apage && PageUptodate(apage)) {
f2fs_put_page(apage, 0);
@@ -1108,6 +1112,7 @@ struct page *get_node_page_ra(struct page *parent, int 
start)
nid = get_nid(parent, start, false);
if (!nid)
return ERR_PTR(-ENOENT);
+   f2fs_bug_on(sbi, check_nid_range(sbi, nid));
 repeat:
page = grab_cache_page(NODE_MAPPING(sbi), nid);
if (!page)
@@ -1127,9 +1132,9 @@ repeat:
end = start + MAX_RA_NODE;
end = min(end, NIDS_PER_BLOCK);
for (i = start + 1; i < end; i++) {
-   nid_t tnid = get_nid(parent, i, false);
-   if (!tnid)
-   continue;
+   nid_t tnid;
+
+   tnid = get_nid(parent, i, false);
ra_node_page(sbi, tnid);
}
 
-- 
2.6.3




[STABLE] kernel oops which can be fixed by peterz's patches

2016-01-05 Thread Byungchul Park

Upstream commits to be applied
==============================

e3fca9e: sched: Replace post_schedule with a balance callback list
4c9a4bc: sched: Allow balance callbacks for check_class_changed()
8046d68: sched,rt: Remove return value from pull_rt_task()
fd7a4be: sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to 
balance callbacks
0ea60c2: sched,dl: Remove return value from pull_dl_task()
9916e21: sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to 
balance callbacks

The reason why these should be applied
======================================

Our products, developed using the 3.16 kernel, hit a kernel oops which can
be fixed with the upstream patches above. The oops is "Unable
to handle kernel NULL pointer dereference at virtual address 00xx"
in the call path,

__sched_setscheduler()
check_class_changed()
switched_to_fair()
check_preempt_curr()
check_preempt_wakeup()
find_matching_se()
is_same_group()

by "if (se->cfs_rq == pse->cfs_rq) // se, pse == NULL" condition.

How to apply it
===============

For stable 4.2.8+:
N/A (already applied)

For longterm 4.1.15:
Cherry-picking the upstream commits works with a trivial conflict.

For longterm 3.18.25:
Refer to the backported patches in this thread.

For longterm 3.14.58:
Refer to the backported patches in this thread. And applying
additional "6c3b4d4: sched: Clean up idle task SMP logic" commit
makes backporting the upstream commits much simpler. So my
backporting patches include the patch.

For longterm 2.6.32.69 ~ 3.12.51: Need to be backported. (I didn't)



Re: [RFC PATCH] arm64: perf test: Improbe bp_signal

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 04:58:00AM +, Wang Nan wrote:

SNIP

> diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
> index fb80c9e..0bc4f76 100644
> --- a/tools/perf/tests/bp_signal.c
> +++ b/tools/perf/tests/bp_signal.c
> @@ -29,14 +29,55 @@
>  
>  static int fd1;
>  static int fd2;
> +static int fd3;
>  static int overflows;
> +static int overflows_2;
> +
> +volatile long the_var;
> +
> +

please put a comment in here explaining that the assembly is used
to have a watchpoint and a breakpoint on a single instruction

IIUC ;-)

thanks,
jirka

> +#if defined (__x86_64__)
> +extern void __test_function(volatile long *ptr);
> +asm (
> + ".globl __test_function\n"
> + "__test_function:\n"
> + "incq (%rdi)\n"
> + "ret\n");
> +#elif defined (__aarch64__)
> +extern void __test_function(volatile long *ptr);
> +asm (
> + ".globl __test_function\n"
> + "__test_function:\n"
> + "str x30, [x0]\n"
> + "ret\n");
> +
> +#else
> +static void __test_function(volatile long *ptr)
> +{
> + *ptr++;
> +}
> +#endif
>  
>  __attribute__ ((noinline))
>  static int test_function(void)
>  {
> + __test_function(&the_var);
> + the_var++;
>   return time(NULL);
>  }

SNIP


[PATCH 2/2] f2fs: introduce __get_node_page to reuse common code

2016-01-05 Thread Chao Yu
There is duplicated code in get_node_page and get_node_page_ra.
Introduce __get_node_page to hold the common parts of the two, and
export get_node_page and get_node_page_ra by reusing __get_node_page.

Signed-off-by: Chao Yu 
---
 fs/f2fs/node.c | 88 +++---
 1 file changed, 35 insertions(+), 53 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index c1ddf3d..5a2d800 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1060,56 +1060,35 @@ void ra_node_page(struct f2fs_sb_info *sbi, nid_t nid)
f2fs_put_page(apage, err ? 1 : 0);
 }
 
-struct page *get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid)
+/*
+ * readahead MAX_RA_NODE number of node pages.
+ */
+void ra_node_pages(struct page *parent, int start)
 {
-   struct page *page;
-   int err;
+   struct f2fs_sb_info *sbi = F2FS_P_SB(parent);
+   struct blk_plug plug;
+   int i, end;
+   nid_t nid;
 
-   if (!nid)
-   return ERR_PTR(-ENOENT);
-   f2fs_bug_on(sbi, check_nid_range(sbi, nid));
-repeat:
-   page = grab_cache_page(NODE_MAPPING(sbi), nid);
-   if (!page)
-   return ERR_PTR(-ENOMEM);
+   blk_start_plug(&plug);
 
-   err = read_node_page(page, READ_SYNC);
-   if (err < 0) {
-   f2fs_put_page(page, 1);
-   return ERR_PTR(err);
-   } else if (err == LOCKED_PAGE) {
-   goto page_hit;
+   /* Then, try readahead for siblings of the desired node */
+   end = start + MAX_RA_NODE;
+   end = min(end, NIDS_PER_BLOCK);
+   for (i = start; i < end; i++) {
+   nid = get_nid(parent, i, false);
+   ra_node_page(sbi, nid);
}
 
-   lock_page(page);
-
-   if (unlikely(!PageUptodate(page))) {
-   f2fs_put_page(page, 1);
-   return ERR_PTR(-EIO);
-   }
-   if (unlikely(page->mapping != NODE_MAPPING(sbi))) {
-   f2fs_put_page(page, 1);
-   goto repeat;
-   }
-page_hit:
-   f2fs_bug_on(sbi, nid != nid_of_node(page));
-   return page;
+   blk_finish_plug(&plug);
 }
 
-/*
- * Return a locked page for the desired node page.
- * And, readahead MAX_RA_NODE number of node pages.
- */
-struct page *get_node_page_ra(struct page *parent, int start)
+struct page *__get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid,
+   struct page *parent, int start)
 {
-   struct f2fs_sb_info *sbi = F2FS_P_SB(parent);
-   struct blk_plug plug;
struct page *page;
-   int err, i, end;
-   nid_t nid;
+   int err;
 
-   /* First, try getting the desired direct node. */
-   nid = get_nid(parent, start, false);
if (!nid)
return ERR_PTR(-ENOENT);
f2fs_bug_on(sbi, check_nid_range(sbi, nid));
@@ -1126,21 +1105,11 @@ repeat:
goto page_hit;
}
 
-   blk_start_plug(&plug);
-
-   /* Then, try readahead for siblings of the desired node */
-   end = start + MAX_RA_NODE;
-   end = min(end, NIDS_PER_BLOCK);
-   for (i = start + 1; i < end; i++) {
-   nid_t tnid;
-
-   tnid = get_nid(parent, i, false);
-   ra_node_page(sbi, tnid);
-   }
-
-   blk_finish_plug(&plug);
+   if (parent)
+   ra_node_pages(parent, start + 1);
 
lock_page(page);
+
if (unlikely(!PageUptodate(page))) {
f2fs_put_page(page, 1);
return ERR_PTR(-EIO);
@@ -1154,6 +1123,19 @@ page_hit:
return page;
 }
 
+struct page *get_node_page(struct f2fs_sb_info *sbi, pgoff_t nid)
+{
+   return __get_node_page(sbi, nid, NULL, 0);
+}
+
+struct page *get_node_page_ra(struct page *parent, int start)
+{
+   struct f2fs_sb_info *sbi = F2FS_P_SB(parent);
+   nid_t nid = get_nid(parent, start, false);
+
+   return __get_node_page(sbi, nid, parent, start);
+}
+
 void sync_inode_page(struct dnode_of_data *dn)
 {
if (IS_INODE(dn->node_page) || dn->inode_page == dn->node_page) {
-- 
2.6.3




Re: Nokia N900: Adjust MPU OPP values

2016-01-05 Thread Pali Rohár
On Saturday 02 January 2016 14:38:36 Nishanth Menon wrote:
> On 01/02/2016 11:16 AM, Tony Lindgren wrote:
> > * Pali Rohár  [160102 06:31]:
> >> Hello,
> >>
> >> MPU OPP table (omap36xx_vddcore_volt_data) defined in
> >> opp3xxx_data.c does not match Nokia N900 phone. For a long time we have
> >> dirty patch in linux-n900 tree for it, see:
> >>
> >> https://github.com/pali/linux-n900/commit/4644c5801d7469e2be01d847c61df3d934dadd8c
> >>
> >> Now that we are transitioning to device tree, is there any way the correct
> >> MPU OPP table for the Nokia N900 could be defined in the DT file?
> > 
> > Hmm I'd assume we can just define this in the dts.. Nishanth, got
> > any comments on this one?
> > 
> 
> We already have definitions in dtb for omap3 OPPs. I think we should
> start using device tree as default. the oppxx_data.c is sticking around
> waiting for legacy boot to go away, then those files should be deleted.
> 

Freemangordon, maybe... would it be possible to add the (now stable and
tested by a lot of users) OPP table from Maemo kernel-power to DTS?

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [RESEND PATCH] rtc: rk808: rename rtc-rk808.c to rtc-rk8xx.c

2016-01-05 Thread Alexandre Belloni
Hi,

On 05/01/2016 at 16:31:38 +0800, Huang, Tao wrote :
> You and Alexandre are right. The rename just makes the driver more
> readable, i.e. lets people know this driver suits more PMICs, not just
> rk808. In fact, I don't care whether the name is rk808 or rk8xx.
> 

For this purpose, you can add the name of the supported PMICs in the
Kconfig.

> The key change of this patch is to try to decouple the rk808 driver and the
> RTC driver. Because register offsets and functions vary between
> different PMICs, we believe it is hard to write one PMIC driver to suit
> all PMICs. So we hope the RTC driver can be shared between all Rockchip PMICs.
> 
> Please review this code:
> 
> -static int rk808_rtc_probe(struct platform_device *pdev)
> +static int rk8xx_rtc_probe(struct platform_device *pdev)
>  {
> - struct rk808 *rk808 = dev_get_drvdata(pdev->dev.parent);
>   ...
> + struct i2c_client *client = to_i2c_client(pdev->dev.parent);
> 
>   ...
> 
> - rk808_rtc->rk808 = rk808;
> + rk8xx_rtc->regmap = devm_regmap_init_i2c(client,
> +  &rk8xx_rtc_regmap_config);
>   ...
> + rk8xx_rtc->i2c = client;
> 
> The old driver has a struct rk808 pointer, which is defined in
> include/linux/mfd/rk808.h
> If we write a new PMIC driver, for example rk818, and define a new struct
> rk818, how can we get this pointer from the RTC driver?
> 
> So another way to solve this problem is to introduce a common struct shared
> between all PMIC drivers. For example rk8xx.
> 

That is probably the best solution. If you want to reuse drivers between
pmic, then have a common structure that you can pass to those drivers.

I believe the regmap_init_i2c should stay in the MFD driver.
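
A rough userspace sketch of that arrangement, to make the shape concrete. Everything here is hypothetical: toy_rk8xx, toy_regmap, toy_device and the probe helpers are stand-ins invented for illustration, not the actual MFD or regmap API. The point is only that the parent creates the register map once and the RTC child picks it up through a common structure, whatever the PMIC variant:

```c
#include <assert.h>
#include <stddef.h>

enum toy_pmic_variant { TOY_RK808, TOY_RK818 };

struct toy_regmap { unsigned char regs[256]; };

/* Hypothetical common structure shared by all PMIC variants. */
struct toy_rk8xx {
	enum toy_pmic_variant variant;
	struct toy_regmap *regmap;	/* created once by the MFD parent */
};

struct toy_device {
	struct toy_device *parent;
	void *drvdata;
};

struct toy_rtc { struct toy_regmap *regmap; };

/* Parent (MFD) probe: owns the regmap and publishes the common struct. */
static int toy_mfd_probe(struct toy_device *mfd, struct toy_rk8xx *chip,
			 struct toy_regmap *map, enum toy_pmic_variant v)
{
	chip->variant = v;
	chip->regmap = map;
	mfd->drvdata = chip;
	return 0;
}

/* Child (RTC) probe: needs no variant-specific type, only the common one. */
static int toy_rtc_probe(struct toy_device *pdev, struct toy_rtc *rtc)
{
	struct toy_rk8xx *chip;

	if (!pdev->parent || !pdev->parent->drvdata)
		return -1;	/* parent not set up yet */
	chip = pdev->parent->drvdata;
	rtc->regmap = chip->regmap;
	pdev->drvdata = rtc;
	return 0;
}
```

In this model the RTC probe works unchanged for TOY_RK808 and TOY_RK818, and no second register map is created in the child.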

-- 
Alexandre Belloni, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com


Re: [PATCH] i2c/designware: enable i2c controller to suspend/resume asynchronously

2016-01-05 Thread Jarkko Nikula

Hi

On 12/24/2015 04:30 PM, Fu, Zhonghui wrote:

> Now, PM core supports asynchronous suspend/resume mode for devices
> during system suspend/resume, and the power state transition of one
> device may be completed in separate kernel thread. PM core ensures
> all power state transition dependency between devices. This patch
> enables designware i2c controllers to suspend/resume asynchronously.
> This will take advantage of multicore and improve system suspend/resume
> speed. After enabling all i2c devices, i2c adapters and i2c controllers
> on ASUS T100TA tablet, the system suspend-to-idle time is reduced to
> about 510ms from 750ms, and the system resume time is reduced to about
> 790ms from 900ms.


Nice reduction :-)


> diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
> index 6b00061..395130b 100644
> --- a/drivers/i2c/busses/i2c-designware-platdrv.c
> +++ b/drivers/i2c/busses/i2c-designware-platdrv.c
> @@ -230,6 +230,7 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
> }
>
> adap = &dev->adapter;
> +   device_enable_async_suspend(&pdev->dev);
> adap->owner = THIS_MODULE;
> adap->class = I2C_CLASS_DEPRECATED;
> ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev));


Does device_enable_async_suspend() need to be called before enabling 
runtime PM? I suppose not, since there also appears to be a related sysfs 
node for toggling it at runtime.


I'm wondering whether you could move the device_enable_async_suspend() call 
into drivers/i2c/busses/i2c-designware-core.c: i2c_dw_probe(), so that 
PCI-enumerated adapters could also take advantage of it.


--
Jarkko


Re: [PATCH v4 3/3] serial: amba-pl011: add ACPI support to AMBA probe

2016-01-05 Thread G Gregory
On 4 January 2016 at 23:13, Timur Tabi  wrote:
> On Wed, Dec 23, 2015 at 8:19 AM, Aleksey Makarov
>  wrote:
>> From: Graeme Gregory 
>>
>> In ACPI this device is only defined in SBSA mode so
>> if we are coming from ACPI use this mode.
>>
>> Signed-off-by: Graeme Gregory 
>> Signed-off-by: Aleksey Makarov 
>> ---
>>  drivers/tty/serial/amba-pl011.c | 37 ++---
>>  1 file changed, 26 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/tty/serial/amba-pl011.c 
>> b/drivers/tty/serial/amba-pl011.c
>> index 899a771..974cb9e 100644
>> --- a/drivers/tty/serial/amba-pl011.c
>> +++ b/drivers/tty/serial/amba-pl011.c
>> @@ -2368,18 +2368,33 @@ static int pl011_probe(struct amba_device *dev, 
>> const struct amba_id *id)
>> if (!uap)
>> return -ENOMEM;
>>
>> -   uap->clk = devm_clk_get(&dev->dev, NULL);
>> -   if (IS_ERR(uap->clk))
>> -   return PTR_ERR(uap->clk);
>> -
>> -   uap->vendor = vendor;
>> -   uap->lcrh_rx = vendor->lcrh_rx;
>> -   uap->lcrh_tx = vendor->lcrh_tx;
>> -   uap->fifosize = vendor->get_fifosize(dev);
>> -   uap->port.irq = dev->irq[0];
>> -   uap->port.ops = &amba_pl011_pops;
>> +   /* ACPI only defines SBSA variant */
>> +   if (has_acpi_companion(&dev->dev)) {
>> +   /*
>> +* According to ARM ARMH0011 is currently the only mapping
>> +* of pl011 in ACPI and it's mapped to SBSA UART mode
>> +*/
>> +   uap->vendor = &vendor_sbsa;
>> +   uap->fifosize   = 32;
>> +   uap->port.ops   = &sbsa_uart_pops;
>> +   uap->fixed_baud = 115200;
>
> I'm confused by this patch.  We already have code like this in
> tty-next, in the form of sbsa_uart_probe():
>
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/gregkh/tty/+/tty-next/drivers/tty/serial/amba-pl011.c#2553
>
Because Russell expressed unhappiness at that code existing. So this
is an alternative method to do same thing with ACPI.

If the "arm,sbsa-uart" id was added to drivers/of/platform.c as an
AMBA id then the same could be done for DT as well.

Ultimately this patch is optional, depending on the maintainers' opinion!

Graeme


Re: [RFC PATCH] arm64: perf test: Improbe bp_signal

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 04:58:00AM +, Wang Nan wrote:

SNIP

> +  *   - SIGUSR1 is delivered -> overflows_2 == 1  (nested signal)
> +  *   - sig_handler_2 return
> +  *   - sig_handler return
> +  *   - fd3 event watchpoint hit -> count3 == 1   (wp and bp in one 
> insn)
> +  *   - SIGIO is delivered   -> overflows == 2
> +  *   - fd2 event breakpoint hit -> count2 == 2
> +  *   - SIGUSR1 is delivered -> overflows_2 == 2
> +  *   - sig_handler_2 return
> +  *   - sig_handler return
> +  *   - fd3 event watchpoint hit -> count3 == 2   (standalone wp)
> +  *   - SIGIO is delivered   -> overflows = 3
> +  *   - fd2 event breakpoint hit -> count2 == 3
> +  *   - SIGUSR1 is delivered -> overflows_2 == 3
> +  *   - sig_handler_2 return
> +  *   - sig_handler return
>*
>* The test case check following error conditions:
>* - we get stuck in signal handler because of debug
> @@ -152,11 +229,13 @@ int test__bp_signal(int subtest __maybe_unused)
>*
>*/
>  
> - fd1 = bp_event(test_function, 1);
> - fd2 = bp_event(sig_handler, 0);
> + fd1 = bp_event(__test_function, 1);
> + fd2 = __xp_event(true, sig_handler, 1, SIGUSR1);
> + fd3 = wp_event((void *)&the_var, 1);
>  

spent some time to figure this out.. would attached change be more readable?

thanks,
jirka


---
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index 0bc4f76c22ca..e5349616ac3f 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -99,7 +99,7 @@ static void sig_handler(int signum __maybe_unused,
}
 }
 
-static int __xp_event(bool is_bp, void *addr, int setup_signal, int signal)
+static int __event(bool is_x, void *addr, int signal)
 {
struct perf_event_attr pe;
int fd;
@@ -109,7 +109,7 @@ static int __xp_event(bool is_bp, void *addr, int 
setup_signal, int signal)
pe.size = sizeof(struct perf_event_attr);
 
pe.config = 0;
-   pe.bp_type = is_bp ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
+   pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
pe.bp_addr = (unsigned long) addr;
pe.bp_len = sizeof(long);
 
@@ -128,25 +128,23 @@ static int __xp_event(bool is_bp, void *addr, int 
setup_signal, int signal)
return TEST_FAIL;
}
 
-   if (setup_signal) {
-   fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
-   fcntl(fd, F_SETSIG, signal);
-   fcntl(fd, F_SETOWN, getpid());
-   }
+   fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
+   fcntl(fd, F_SETSIG, signal);
+   fcntl(fd, F_SETOWN, getpid());
 
ioctl(fd, PERF_EVENT_IOC_RESET, 0);
 
return fd;
 }
 
-static int bp_event(void *addr, int setup_signal)
+static int bp_event(void *addr, int signal)
 {
-   return __xp_event(true, addr, setup_signal, SIGIO);
+   return __event(true, addr, signal);
 }
 
-static int wp_event(void *addr, int setup_signal)
+static int wp_event(void *addr, int signal)
 {
-   return __xp_event(false, addr, setup_signal, SIGIO);
+   return __event(false, addr, signal);
 }
 
 static long long bp_count(int fd)
@@ -229,9 +227,9 @@ int test__bp_signal(int subtest __maybe_unused)
 *
 */
 
-   fd1 = bp_event(__test_function, 1);
-   fd2 = __xp_event(true, sig_handler, 1, SIGUSR1);
-   fd3 = wp_event((void *)&the_var, 1);
+   fd1 = bp_event(__test_function, SIGIO);
+   fd2 = bp_event(sig_handler, SIGUSR1);
+   fd3 = wp_event((void *)&the_var, SIGIO);
 
ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0);
ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0);


Re: [PATCH v3 1/3] clocksource/vt8500: Increase the minimum delta

2016-01-05 Thread Daniel Lezcano

On 01/01/2016 02:24 PM, Roman Volkov wrote:

> From: Roman Volkov 
>
> The vt8500 clocksource driver declares itself as capable to handle the
> minimum delay of 4 cycles by passing the value into
> clockevents_config_and_register(). The vt8500_timer_set_next_event()
> requires the passed cycles value to be at least 16. The impact is that
> userspace hangs in nanosleep() calls with small delay intervals.
>
> This problem is reproducible in Linux 4.2 starting from:
> c6eb3f70d448 ('hrtimer: Get rid of hrtimer softirq')
>
> Signed-off-by: Roman Volkov 
> Acked-by: Alexey Charkov 


Hi Roman,

I looked at the email thread, and IIUC if set_next_event fails, the 
system freezes. Your patch fixes the issue for your driver but not the 
real issue, because if set_next_event fails, at least a warning should 
appear in the log or, better, nanosleep should fail gracefully.


BTW, why is the min delta MIN_OSCR_DELTA * 2 in clockevents_config_and_register()?
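
For reference, the boundary of the check in vt8500_timer_set_next_event() can be modelled in userspace. This is a sketch, assuming the alarm value is the counter at programming time plus the requested cycles; toy_set_next_event and its elapsed parameter (counter ticks that pass between writing the match register and re-reading the counter) are hypothetical. It does not answer why the factor of 2 was chosen, it only shows which deltas the "<= MIN_OSCR_DELTA" test rejects:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MIN_OSCR_DELTA 16

/* Returns 0 if the event is accepted, -ETIME if it is too close. */
static int toy_set_next_event(uint32_t cycles, uint32_t now, uint32_t elapsed)
{
	uint32_t alarm = now + cycles;	/* match value written to the timer */
	uint32_t later = now + elapsed;	/* counter when re-read after the write */

	/* Signed difference keeps the comparison correct across wraparound. */
	if ((int32_t)(alarm - later) <= TOY_MIN_OSCR_DELTA)
		return -ETIME;
	return 0;
}
```

In this model a request of 4 or 16 cycles is always rejected, 17 is accepted only if no ticks elapse, and 32 leaves room for up to 15 elapsed ticks before it is rejected as well.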


> ---
>   drivers/clocksource/vt8500_timer.c | 6 --
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clocksource/vt8500_timer.c b/drivers/clocksource/vt8500_timer.c
> index a92e94b..dfc3bb4 100644
> --- a/drivers/clocksource/vt8500_timer.c
> +++ b/drivers/clocksource/vt8500_timer.c
> @@ -50,6 +50,8 @@
>
>   #define msecs_to_loops(t) (loops_per_jiffy / 1000 * HZ * t)
>
> +#define MIN_OSCR_DELTA 16
> +
>   static void __iomem *regbase;
>
>   static cycle_t vt8500_timer_read(struct clocksource *cs)
> @@ -80,7 +82,7 @@ static int vt8500_timer_set_next_event(unsigned long cycles,
> cpu_relax();
> writel((unsigned long)alarm, regbase + TIMER_MATCH_VAL);
>
> -   if ((signed)(alarm - clocksource.read(&clocksource)) <= 16)
> +   if ((signed)(alarm - clocksource.read(&clocksource)) <= MIN_OSCR_DELTA)
> return -ETIME;
>
> writel(1, regbase + TIMER_IER_VAL);
> @@ -151,7 +153,7 @@ static void __init vt8500_timer_init(struct device_node *np)
> pr_err("%s: setup_irq failed for %s\n", __func__,
> clockevent.name);
> clockevents_config_and_register(&clockevent, VT8500_TIMER_HZ,
> -   4, 0xf000);
> +   MIN_OSCR_DELTA * 2, 0xf000);
>   }
>
>   CLOCKSOURCE_OF_DECLARE(vt8500, "via,vt8500-timer", vt8500_timer_init);




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] arm64: perf test: Improbe bp_signal

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 04:58:00AM +, Wang Nan wrote:
> Will Deacon [1] has some question on patch [2]. This patch improves
> test__bp_signal so we can test:

there's a typo (s/Improbe/Improve/) in the subject ;-)

jirka


Re: [PATCH 01/10] i2c-mux: add common core data for every mux instance

2016-01-05 Thread Peter Rosin
Hi Guenter,

On 2016-01-04 16:37, Guenter Roeck wrote:
> On 01/04/2016 07:10 AM, Peter Rosin wrote:
>> From: Peter Rosin 
>>
>> The initial core mux structure starts off small with only the parent
>> adapter pointer, which all muxes have, and a priv pointer for mux
>> driver private data.
>>
>> Add i2c_mux_alloc function to unify the creation of a mux.
>>
>> Where appropriate, pass around the mux core structure instead of the
>> parent adapter or the driver private data.
>>
>> Remove the parent adapter pointer from the driver private data for all
>> mux drivers.
>>
>> Signed-off-by: Peter Rosin 
>> ---
>>   drivers/i2c/i2c-mux.c                      | 35 -
>>   drivers/i2c/muxes/i2c-arb-gpio-challenge.c | 24 +++-
>>   drivers/i2c/muxes/i2c-mux-gpio.c           | 20 +
>>   drivers/i2c/muxes/i2c-mux-pca9541.c        | 36 --
>>   drivers/i2c/muxes/i2c-mux-pca954x.c        | 22 +-
>>   drivers/i2c/muxes/i2c-mux-pinctrl.c        | 24 +++-
>>   drivers/i2c/muxes/i2c-mux-reg.c            | 25 -
>>   include/linux/i2c-mux.h                    | 14 +++-
>>   8 files changed, 129 insertions(+), 71 deletions(-)
>>

*snip*

>> +struct i2c_mux_core *i2c_mux_alloc(struct device *dev, int sizeof_priv)
>> +{
>> +struct i2c_mux_core *muxc;
>> +
>> +muxc = devm_kzalloc(dev, sizeof(*muxc), GFP_KERNEL);
>> +if (!muxc)
>> +return NULL;
>> +if (sizeof_priv) {
>> +muxc->priv = devm_kzalloc(dev, sizeof_priv, GFP_KERNEL);
>> +if (!muxc->priv)
>> +goto fail;
>> +}
> 
> Why not just allocate sizeof(*muxc) + sizeof_priv in a single operation
> and then assign muxc->priv to muxc + 1 if sizeof_priv > 0 ?

Why indeed, good suggestion.
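
A minimal userspace sketch of the single-allocation idea, assuming calloc()
in place of devm_kzalloc() and a cut-down structure. The names mux_core and
mux_alloc_sketch() are hypothetical; this is not the v2 patch.

```c
#include <stdlib.h>

/* Cut-down stand-in for the mux core structure discussed above. */
struct mux_core {
    void *parent;
    void *priv;
};

/*
 * One zeroed allocation holds the core structure followed by the private
 * area; priv points just past the core when a private area was requested.
 */
static struct mux_core *mux_alloc_sketch(size_t sizeof_priv)
{
    struct mux_core *muxc = calloc(1, sizeof(*muxc) + sizeof_priv);

    if (!muxc)
        return NULL;
    if (sizeof_priv)
        muxc->priv = muxc + 1;
    return muxc;
}
```

One allocation means one failure path and no partially constructed object.
Note that muxc + 1 is only aligned as strictly as struct mux_core itself,
so private data with stricter alignment needs would require extra care.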

*snip*

>> @@ -134,13 +134,14 @@ static int i2c_arbitrator_probe(struct platform_device *pdev)
>>   return -EINVAL;
>>   }
>>
>> -arb = devm_kzalloc(dev, sizeof(*arb), GFP_KERNEL);
>> -if (!arb) {
>> -dev_err(dev, "Cannot allocate i2c_arbitrator_data\n");
>> +muxc = i2c_mux_alloc(dev, sizeof(*arb));
>> +if (!muxc) {
>> +dev_err(dev, "Cannot allocate i2c_mux_core structure\n");
> 
> Unnecessary error message.
> 

Right, I'll remove that (and the others just like it).

I'll see if I can cook up a v2 that also converts the i2c muxes elsewhere in
drivers/ that I wasn't aware of.

Cheers,
Peter


Re: [RFC PATCH] arm64: perf test: Improbe bp_signal

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 04:58:00AM +, Wang Nan wrote:

SNIP

>* Following processing should happen:
> @@ -141,6 +203,21 @@ int test__bp_signal(int subtest __maybe_unused)
>*   - fd1 event breakpoint hit -> count1 == 1
>*   - SIGIO is delivered   -> overflows == 1
>*   - fd2 event breakpoint hit -> count2 == 1
> +  *   - SIGUSR1 is delivered -> overflows_2 == 1  (nested signal)
> +  *   - sig_handler_2 return
> +  *   - sig_handler return
> +  *   - fd3 event watchpoint hit -> count3 == 1   (wp and bp in one insn)
> +  *   - SIGIO is delivered   -> overflows == 2
> +  *   - fd2 event breakpoint hit -> count2 == 2
> +  *   - SIGUSR1 is delivered -> overflows_2 == 2
> +  *   - sig_handler_2 return
> +  *   - sig_handler return
> +  *   - fd3 event watchpoint hit -> count3 == 2   (standalone wp)
> +  *   - SIGIO is delivered   -> overflows = 3
> +  *   - fd2 event breakpoint hit -> count2 == 3
> +  *   - SIGUSR1 is delivered -> overflows_2 == 3
> +  *   - sig_handler_2 return
> +  *   - sig_handler return

also each line in here could be prefixed with 'code action'
that led to the result on the line, like:


 * exec:            result:
 *
 * __test_function  - fd1 event breakpoint hit -> count1 == 1
 *                  - SIGIO is delivered       -> overflows == 1
 * sig_handler      - fd2 event breakpoint hit -> count2 == 1
 *                  - SIGUSR1 is delivered     -> overflows_2 == 1  (nested signal)
 *                  - sig_handler_2 return
 *                  - sig_handler return
 * incq (%rdi)      - fd3 event watchpoint hit -> count3 == 1   (wp and bp in one insn)
 *                  - SIGIO is delivered       -> overflows == 2


hum.. but it might take all the fun out of it ;-)

jirka


Re: perf_event_open() ABI compatability

2016-01-05 Thread Peter Zijlstra
On Mon, Jan 04, 2016 at 05:19:13PM -0500, Vince Weaver wrote:
> 
> So I think this might be revisiting an issue that has come up before, but
> we're having backward compatibility issues with PAPI and libpfm4 and
> the perf_event_open() system call.
> 
> If a user specifies exclude_guest=1 on an older kernel that doesn't 
> support it, we get the awesome EINVAL error return code and it often
> takes hours to track down the cause.
> 
> Now in theory the ABI is maintained via the "size" field.  So you can
> figure out the size of the attr struct by setting an invalid size
> and then getting E2BIG with size set to the value the kernel expects.
>   
> This doesn't help with exclude_guest though, as that's in the giant union
> in the middle of the attr, and there's absolutely no mechanism at all
> to tell when that has been extended.
> 
> Is there any solution to all of this, except having to carry around a big 
> table of kernel version numbers for when features were added?

The perf tool probes for this: in reverse order of feature addition, it
removes flags and retries.

The advantage of the dynamic probing is that it will work with franken
kernels that have bits backported; where relying on the kernel version
number is pointless.

But yes, this is all somewhat fugly.
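
The probing approach can be sketched as below. This is an illustration
under stated assumptions, not the perf tool's code: struct probe_attr, the
open_fn callback, open_with_fallback() and fake_old_kernel() are all
hypothetical, and the flag order is only an example of "newest first". A
real caller would pass a wrapper around perf_event_open() as the callback.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for attr bits that were added over time. */
struct probe_attr {
    bool exclude_guest;   /* treated as newest in this sketch */
    bool exclude_host;
    bool sample_id_all;   /* treated as oldest in this sketch */
};

/* Tries to open an event: returns fd >= 0, or -1 with errno set. */
typedef int (*open_fn)(const struct probe_attr *attr);

/* Retry on EINVAL, clearing one set flag per round, newest first. */
static int open_with_fallback(struct probe_attr *attr, open_fn try_open)
{
    bool *newest_first[] = {
        &attr->exclude_guest, &attr->exclude_host, &attr->sample_id_all,
    };
    size_t n = sizeof(newest_first) / sizeof(newest_first[0]);
    size_t i = 0;

    for (;;) {
        int fd = try_open(attr);

        if (fd >= 0 || errno != EINVAL)
            return fd;
        while (i < n && !*newest_first[i])
            i++;                    /* skip flags the caller never set */
        if (i == n)
            return -1;              /* nothing left to drop */
        *newest_first[i++] = false;
    }
}

/* Simulated old kernel that rejects the two exclude_* bits. */
static int fake_old_kernel(const struct probe_attr *attr)
{
    if (attr->exclude_guest || attr->exclude_host) {
        errno = EINVAL;
        return -1;
    }
    return 3;
}
```

Because the loop reacts to what the running kernel accepts, it behaves the
same on a kernel with backported features as on a mainline one, which is
the advantage over a version table mentioned above.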

> Ideally we would somehow want E2BIG returned plus the size of __reserved_1 
> if the value of __reserved_1 is not zero.  I suppose at this point in the 
> game it's too late for this to be much help and we're going to have to
> work around the problem forever anyway.

Right :/ So I was hoping some of that extended error reporting stuff
from Alexander Shishkin would help out with this. Not sure where that
stranded -- I think in the attempt to make it too generic or so.

But yes, since that too will only be available in new kernels, old
kernels will still have to cope.



RE: [f2fs-dev] [PATCH 1/2] f2fs: check node id earily when readahead NAT page

2016-01-05 Thread Chao Yu
Hi Jaegeuk,

This patch adds checks for some cases missed in ("f2fs: return early
when trying to read null nid"). Merging it into that patch or applying
it separately is fine with me.

Thanks,

> -----Original Message-----
> From: Chao Yu [mailto:chao2...@samsung.com]
> Sent: Tuesday, January 05, 2016 4:52 PM
> To: Jaegeuk Kim
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: [f2fs-dev] [PATCH 1/2] f2fs: check node id earily when readahead NAT page
> 
> Add node id check in ra_node_page and get_node_page_ra like get_node_page.
> 
> Signed-off-by: Chao Yu 
> ---
>  fs/f2fs/node.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 6d5f548..c1ddf3d 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1041,6 +1041,10 @@ void ra_node_page(struct f2fs_sb_info *sbi, nid_t nid)
>   struct page *apage;
>   int err;
> 
> + if (!nid)
> + return;
> + f2fs_bug_on(sbi, check_nid_range(sbi, nid));
> +
>   apage = find_get_page(NODE_MAPPING(sbi), nid);
>   if (apage && PageUptodate(apage)) {
>   f2fs_put_page(apage, 0);
> @@ -1108,6 +1112,7 @@ struct page *get_node_page_ra(struct page *parent, int start)
>   nid = get_nid(parent, start, false);
>   if (!nid)
>   return ERR_PTR(-ENOENT);
> + f2fs_bug_on(sbi, check_nid_range(sbi, nid));
>  repeat:
>   page = grab_cache_page(NODE_MAPPING(sbi), nid);
>   if (!page)
> @@ -1127,9 +1132,9 @@ repeat:
>   end = start + MAX_RA_NODE;
>   end = min(end, NIDS_PER_BLOCK);
>   for (i = start + 1; i < end; i++) {
> - nid_t tnid = get_nid(parent, i, false);
> - if (!tnid)
> - continue;
> + nid_t tnid;
> +
> + tnid = get_nid(parent, i, false);
>   ra_node_page(sbi, tnid);
>   }
> 
> --
> 2.6.3


[PATCH 1/1 v2] include/uapi/linux/sockios.h: mark SIOCRTMSG unused

2016-01-05 Thread Heinrich Schuchardt
IOCTL SIOCRTMSG does nothing but return EINVAL.

So comment it as unused.

SIOCRTMSG is only used in:
* net/ipv4/af_inet.c
* include/uapi/linux/sockios.h

inet_ioctl calls ip_rt_ioctl.
ip_rt_ioctl only handles SIOCADDRT and SIOCDELRT and returns -EINVAL
otherwise.

Signed-off-by: Heinrich Schuchardt 
---
 include/uapi/linux/sockios.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/sockios.h b/include/uapi/linux/sockios.h
index e888b1a..8e7890b 100644
--- a/include/uapi/linux/sockios.h
+++ b/include/uapi/linux/sockios.h
@@ -27,7 +27,7 @@
 /* Routing table calls. */
 #define SIOCADDRT   0x890B      /* add routing table entry     */
 #define SIOCDELRT   0x890C      /* delete routing table entry  */
-#define SIOCRTMSG   0x890D      /* call to routing system      */
+#define SIOCRTMSG   0x890D      /* unused                      */
 
 /* Socket configuration controls. */
 #define SIOCGIFNAME 0x8910      /* get iface name              */
-- 
2.1.4
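
The dispatch behaviour described in the commit message can be modelled in
a few lines. This is a sketch, not the code in net/ipv4: rt_ioctl_sketch()
and the SKETCH_ constants are hypothetical names, with the numeric values
copied from the header shown in the diff.

```c
#include <errno.h>

/* Values as listed in include/uapi/linux/sockios.h in the diff above. */
enum {
    SKETCH_SIOCADDRT = 0x890B,
    SKETCH_SIOCDELRT = 0x890C,
    SKETCH_SIOCRTMSG = 0x890D,
};

/* Only add/delete are handled; everything else falls through to -EINVAL. */
static int rt_ioctl_sketch(unsigned int cmd)
{
    switch (cmd) {
    case SKETCH_SIOCADDRT:
    case SKETCH_SIOCDELRT:
        return 0;       /* stands in for the real routing work */
    default:
        return -EINVAL;
    }
}
```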



Re: [STABLE] kernel oops which can be fixed by peterz's patches

2016-01-05 Thread Peter Zijlstra
On Tue, Jan 05, 2016 at 05:52:11PM +0900, Byungchul Park wrote:
> 
> Upstream commits to be applied
> ==============================
> 
> e3fca9e: sched: Replace post_schedule with a balance callback list
> 4c9a4bc: sched: Allow balance callbacks for check_class_changed()
> 8046d68: sched,rt: Remove return value from pull_rt_task()
> fd7a4be: sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks
> 0ea60c2: sched,dl: Remove return value from pull_dl_task()
> 9916e21: sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks
> 
> The reason why these should be applied
> ======================================
> 
> Our products, developed on the 3.16 kernel, hit a kernel oops that can
> be fixed with the upstream patches above. The oops is an "Unable
> to handle kernel NULL pointer dereference at virtual address 00xx"
> in the call path,
> 
> __sched_setscheduler()
>   check_class_changed()
>   switched_to_fair()
>   check_preempt_curr()
>   check_preempt_wakeup()
>   find_matching_se()
>   is_same_group()
> 
> by "if (se->cfs_rq == pse->cfs_rq) // se, pse == NULL" condition.

So the reason I didn't mark them for stable is that they were non
trivial, however they've been in for a while now and nothing broke, so I
suppose backporting them isn't a problem.

> How to apply it
> ===============
> 
> For stable 4.2.8+:
>   N/A (already applied)
> 
> For longterm 4.1.15:
>   Cherry-picking the upstream commits works with a trivial conflict.
> 
> For longterm 3.18.25:
>   Refer to the backported patches in this thread.
> 
> For longterm 3.14.58:
>   Refer to the backported patches in this thread. And applying
>   additional "6c3b4d4: sched: Clean up idle task SMP logic" commit
>   makes backporting the upstream commits much simpler. So my
>   backporting patches include the patch.
> 
> For longterm 2.6.32.69 ~ 3.12.51: Need to be backported. (I didn't)

No objection as long as you've actually tested things etc..


Re: [PATCH 1/5] perf tools: Fix segfault when using -s trace_fields

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 12:03:43PM +0900, Namhyung Kim wrote:
> When the 'trace_fields' sort key is used explicitly for non-tracepoint
> events, perf segfaults because the code assumes evsel->tp_format is set.
> Skip those events in add_all_dynamic_fields().

Acked-by: Jiri Olsa 

thanks,
jirka

> 
> Signed-off-by: Namhyung Kim 
> ---
>  tools/perf/util/sort.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index e558e87cafaf..59c4c8586d79 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1955,6 +1955,9 @@ static int add_all_dynamic_fields(struct perf_evlist *evlist, bool raw_trace)
>   struct perf_evsel *evsel;
>  
>   evlist__for_each(evlist, evsel) {
> + if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
> + continue;
> +
>   ret = add_evsel_fields(evsel, raw_trace);
>   if (ret < 0)
>   return ret;
> -- 
> 2.6.4
> 
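
The guard in the quoted patch can be illustrated with a self-contained
sketch. The sketch_ types and count_dynamic_fields() are hypothetical
stand-ins for the perf structures; the point is only that the format
pointer is dereferenced for tracepoint events and never for others.

```c
#include <stddef.h>

enum sketch_type { SKETCH_HARDWARE, SKETCH_TRACEPOINT };

struct sketch_format {
    int nr_fields;
};

struct sketch_evsel {
    enum sketch_type type;
    const struct sketch_format *tp_format;  /* NULL unless a tracepoint */
};

/* Sum the field counts, skipping events that carry no tracepoint format. */
static int count_dynamic_fields(const struct sketch_evsel *evsels, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++) {
        if (evsels[i].type != SKETCH_TRACEPOINT)
            continue;   /* without this check, tp_format may be NULL */
        total += evsels[i].tp_format->nr_fields;
    }
    return total;
}
```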


[bug] wrong result of android callchain

2016-01-05 Thread He Kuang

I found a wrong aarch64 callchain when using perf script on
an Android phone.

Here's the callchain record fragment from the output of perf script:

  init   369 [002]   339.970607: raw_syscalls:sys_enter: NR 22 (b, 7fd9e360a0, 10, , 0, 8)
 ...
   230ac [unknown] (/system/lib64/libsurfaceflinger.so)
11a0 main (/system/bin/surfaceflinger)
   1c3fc __libc_init (/system/lib64/libc.so)
 fd0 _start (/system/bin/surfaceflinger)
29ec __dl__start (/system/bin/linker64)

The fault occurred in the '[unknown]' line. From the objdump output of
/system/bin/surfaceflinger, we can see the branch instruction before
0x11a0:

 # objdump /system/bin/surfaceflinger
1198:   f9400fe0    ldr x0, [sp,#24]
119c:   9705        bl  db0 <_ZN7android14SurfaceFlinger3runEv@plt>
11a0:   f9400be8    ldr x8, [sp,#16]
11a4:   b4c8        cbz x8, 11bc

The function '_ZN7android14SurfaceFlinger3runEv' is located at 0x3a094
~ 0x3a0ac in libsurfaceflinger.so, but perf misparsed that value to
0x230ac:

 # objdump libsurfaceflinger.so
  0003a094 <_ZN7android14SurfaceFlinger3runEv>:
3a094:   a9be4ff4    stp x20, x19, [sp,#-32]!
3a098:   a9017bfd    stp x29, x30, [sp,#16]
3a09c:   910043fd    add x29, sp, #0x10
3a0a0:   910c0013    add x19, x0, #0x300
3a0a4:   aa1303e0    mov x0, x19
3a0a8:   97fff12f    bl  36564 <_ZN7android12MessageQueue11waitMessageEv>
3a0ac:   17fe        b   3a0a4 <_ZN7android14SurfaceFlinger3runEv+0x10>

There's a difference of 0x17000 between those two offsets; this value
appears to be the VirtAddr of the library's LOAD segment:

 # readelf -a libsurfaceflinger.so
  Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                   FileSiz            MemSiz             Flags  Align
    LOAD           0x                 0x00017000         0x00017000
                   0x00057258         0x00057258         R E    1000
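
One possible reading of these numbers, written out as arithmetic. This is
a sketch of the suspected relationship, not a diagnosis of perf: the two
helper names are hypothetical, and the segment's file offset is assumed to
be 0 because the Offset column is truncated in the dump above.

```c
#include <stdint.h>

/* Address as seen in the ELF file's symbol table -> offset in the file. */
static uint64_t vaddr_to_file_offset(uint64_t vaddr, uint64_t p_vaddr,
                                     uint64_t p_offset)
{
    return vaddr - p_vaddr + p_offset;
}

/* Inverse mapping: file offset -> address to look up in the symbol table. */
static uint64_t file_offset_to_vaddr(uint64_t offset, uint64_t p_vaddr,
                                     uint64_t p_offset)
{
    return offset - p_offset + p_vaddr;
}
```

With p_vaddr = 0x17000, the return address 0x3a0ac inside
_ZN7android14SurfaceFlinger3runEv corresponds to file offset 0x230ac, the
value perf printed. If that reading is right, the symbol lookup would need
the inverse mapping applied before searching the symbol table.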


   




[PATCH for v3.14.58 2/7] sched: Replace post_schedule with a balance callback list

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Generalize the post_schedule() stuff into a balance callback list.
This allows us to more easily use it outside of schedule() and cross
sched_class.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.424032...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/core.c
kernel/sched/deadline.c
kernel/sched/rt.c
kernel/sched/sched.h
---
 kernel/sched/core.c | 36 
 kernel/sched/deadline.c | 23 ---
 kernel/sched/rt.c   | 27 ---
 kernel/sched/sched.h| 19 +--
 4 files changed, 73 insertions(+), 32 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bbe9577..cc1be56 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2179,18 +2179,30 @@ static inline void pre_schedule(struct rq *rq, struct task_struct *prev)
 }
 
 /* rq->lock is NOT held, but preemption is disabled */
-static inline void post_schedule(struct rq *rq)
+static void __balance_callback(struct rq *rq)
 {
-   if (rq->post_schedule) {
-   unsigned long flags;
+   struct callback_head *head, *next;
+   void (*func)(struct rq *rq);
+   unsigned long flags;
 
-   raw_spin_lock_irqsave(&rq->lock, flags);
-   if (rq->curr->sched_class->post_schedule)
-   rq->curr->sched_class->post_schedule(rq);
-   raw_spin_unlock_irqrestore(&rq->lock, flags);
+   raw_spin_lock_irqsave(&rq->lock, flags);
+   head = rq->balance_callback;
+   rq->balance_callback = NULL;
+   while (head) {
+   func = (void (*)(struct rq *))head->func;
+   next = head->next;
+   head->next = NULL;
+   head = next;
 
-   rq->post_schedule = 0;
+   func(rq);
}
+   raw_spin_unlock_irqrestore(&rq->lock, flags);
+}
+
+static inline void balance_callback(struct rq *rq)
+{
+   if (unlikely(rq->balance_callback))
+   __balance_callback(rq);
 }
 
 #else
@@ -2199,7 +2211,7 @@ static inline void pre_schedule(struct rq *rq, struct task_struct *p)
 {
 }
 
-static inline void post_schedule(struct rq *rq)
+static inline void balance_callback(struct rq *rq)
 {
 }
 
@@ -2220,7 +2232,7 @@ asmlinkage void schedule_tail(struct task_struct *prev)
 * FIXME: do we need to worry about rq being invalidated by the
 * task_switch?
 */
-   post_schedule(rq);
+   balance_callback(rq);
 
 #ifdef __ARCH_WANT_UNLOCKED_CTXSW
/* In this case, finish_task_switch does not reenable preemption */
@@ -2732,7 +2744,7 @@ need_resched:
} else
raw_spin_unlock_irq(&rq->lock);
 
-   post_schedule(rq);
+   balance_callback(rq);
 
sched_preempt_enable_no_resched();
if (need_resched())
@@ -6902,7 +6914,7 @@ void __init sched_init(void)
rq->sd = NULL;
rq->rd = NULL;
rq->cpu_power = SCHED_POWER_SCALE;
-   rq->post_schedule = 0;
+   rq->balance_callback = NULL;
rq->active_balance = 0;
rq->next_balance = jiffies;
rq->push_cpu = 0;
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 8d3c5dd..aaefe1b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -210,6 +210,18 @@ static inline int has_pushable_dl_tasks(struct rq *rq)
 
 static int push_dl_task(struct rq *rq);
 
+static DEFINE_PER_CPU(struct callback_head, dl_balance_head);
+
+static void push_dl_tasks(struct rq *);
+
+static inline void queue_push_tasks(struct rq *rq)
+{
+   if (!has_pushable_dl_tasks(rq))
+   return;
+
+   queue_balance_callback(rq, &per_cpu(dl_balance_head, rq->cpu), push_dl_tasks);
+}
+
 #else
 
 static inline
@@ -232,6 +244,9 @@ void dec_dl_migration(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 {
 }
 
+static inline void queue_push_tasks(struct rq *rq)
+{
+}
 #endif /* CONFIG_SMP */
 
 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags);
@@ -1005,7 +1020,7 @@ struct task_struct *pick_next_task_dl(struct rq *rq)
 #endif
 
 #ifdef CONFIG_SMP
-   rq->post_schedule = has_pushable_dl_tasks(rq);
+   queue_push_tasks(rq);
 #endif /* CONFIG_SMP */
 
return p;
@@ -1422,11 +1437,6 @@ static void pre_schedule_dl(struct rq *rq, struct task_struct *prev)
pull_dl_task(rq);
 }
 
-static void post_schedule_dl(struct rq *rq)
-{
-   push_dl_tasks(rq);
-}
-
 /*
  * Since the task is not running and a reschedule is not going to happen
  * anytime soon on its runqueue, we try pushing it away now.
@@ -1615,7 +162
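
The callback-list idea in this patch can be sketched in userspace. This is
an illustration only: locking is omitted, the sketch_ names are
hypothetical, and the queued flag is this sketch's own bookkeeping rather
than something shown in the patch (the definition of
queue_balance_callback() is cut off in this archive). The drain loop
mirrors __balance_callback() from the diff above.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_rq;

struct sketch_callback {
    struct sketch_callback *next;
    void (*func)(struct sketch_rq *rq);
    bool queued;                    /* sketch-only duplicate guard */
};

struct sketch_rq {
    struct sketch_callback *balance_callback;   /* pending list head */
    int pushes;                                  /* counts callback runs */
};

/* Add a callback to the runqueue's pending list unless already queued. */
static void queue_callback_sketch(struct sketch_rq *rq,
                                  struct sketch_callback *head,
                                  void (*func)(struct sketch_rq *rq))
{
    if (head->queued)
        return;
    head->queued = true;
    head->func = func;
    head->next = rq->balance_callback;
    rq->balance_callback = head;
}

/* Detach the whole list, then run each entry once. */
static void run_callbacks_sketch(struct sketch_rq *rq)
{
    struct sketch_callback *head = rq->balance_callback;

    rq->balance_callback = NULL;
    while (head) {
        struct sketch_callback *next = head->next;
        void (*func)(struct sketch_rq *rq) = head->func;

        head->next = NULL;
        head->queued = false;
        head = next;
        func(rq);
    }
}

/* Example callback standing in for push_dl_tasks() / push_rt_tasks(). */
static void push_tasks_sketch(struct sketch_rq *rq)
{
    rq->pushes++;
}
```

Compared with a single post_schedule flag, a list lets several scheduling
classes queue work independently and lets callers outside schedule() drain
it, which is what the commit message describes.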

[PATCH for v3.14.58 4/7] sched,rt: Remove return value from pull_rt_task()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to be able to use pull_rt_task() from a callback, we need to
do away with the return value.

Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to rt balancing.

Too many reschedules are not a correctness issue; too few are.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.679002...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/rt.c
---
 kernel/sched/rt.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2b980d0..d235fd7 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1652,14 +1652,15 @@ static void push_rt_tasks(struct rq *rq)
;
 }
 
-static int pull_rt_task(struct rq *this_rq)
+static void pull_rt_task(struct rq *this_rq)
 {
-   int this_cpu = this_rq->cpu, ret = 0, cpu;
+   int this_cpu = this_rq->cpu, cpu;
+   bool resched = false;
struct task_struct *p;
struct rq *src_rq;
 
if (likely(!rt_overloaded(this_rq)))
-   return 0;
+   return;
 
/*
 * Match the barrier from rt_set_overloaded; this guarantees that if we
@@ -1716,7 +1717,7 @@ static int pull_rt_task(struct rq *this_rq)
if (p->prio < src_rq->curr->prio)
goto skip;
 
-   ret = 1;
+   resched = true;
 
deactivate_task(src_rq, p, 0);
set_task_cpu(p, this_cpu);
@@ -1732,7 +1733,8 @@ skip:
double_unlock_balance(this_rq, src_rq);
}
 
-   return ret;
+   if (resched)
+   resched_task(this_rq->curr);
 }
 
 static void pre_schedule_rt(struct rq *rq, struct task_struct *prev)
@@ -1835,8 +1837,7 @@ static void switched_from_rt(struct rq *rq, struct task_struct *p)
if (!p->on_rq || rq->rt.rt_nr_running)
return;
 
-   if (pull_rt_task(rq))
-   resched_task(rq->curr);
+   pull_rt_task(rq);
 }
 
 void init_sched_rt_class(void)
-- 
1.9.1
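
The shape of this change can be shown with a small model. It is a sketch
under simplifying assumptions: struct pull_rq and its fields are
hypothetical, and the "pull" is reduced to moving a counter. The point is
that a function with a void (*)(struct rq *) signature cannot report
whether it pulled anything, so it has to request the reschedule itself.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced runqueue. */
struct pull_rq {
    bool overloaded;     /* some other CPU has tasks worth pulling */
    int pullable;        /* how many tasks could be pulled */
    int nr_running;
    bool need_resched;   /* stands in for resched_task(rq->curr) */
};

/* Returns nothing, so it fits a callback slot taking only the runqueue. */
static void pull_task_sketch(struct pull_rq *this_rq)
{
    bool resched = false;

    if (!this_rq->overloaded)
        return;

    while (this_rq->pullable > 0) {
        this_rq->pullable--;
        this_rq->nr_running++;
        resched = true;
    }

    /* Formerly the caller's job, based on the return value. */
    if (resched)
        this_rq->need_resched = true;
}
```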



[PATCH for v3.14.58 6/7] sched,dl: Remove return value from pull_dl_task()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to be able to use pull_dl_task() from a callback, we need to
do away with the return value.

Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to rt balancing.

Too many reschedules are not a correctness issue; too few are.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.859398...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/deadline.c
---
 kernel/sched/deadline.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index aaefe1b..ec1f21d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1351,15 +1351,16 @@ static void push_dl_tasks(struct rq *rq)
;
 }
 
-static int pull_dl_task(struct rq *this_rq)
+static void pull_dl_task(struct rq *this_rq)
 {
-   int this_cpu = this_rq->cpu, ret = 0, cpu;
+   int this_cpu = this_rq->cpu, cpu;
struct task_struct *p;
+   bool resched = false;
struct rq *src_rq;
u64 dmin = LONG_MAX;
 
if (likely(!dl_overloaded(this_rq)))
-   return 0;
+   return;
 
/*
 * Match the barrier from dl_set_overloaded; this guarantees that if we
@@ -1414,7 +1415,7 @@ static int pull_dl_task(struct rq *this_rq)
   src_rq->curr->dl.deadline))
goto skip;
 
-   ret = 1;
+   resched = true;
 
deactivate_task(src_rq, p, 0);
set_task_cpu(p, this_cpu);
@@ -1427,7 +1428,8 @@ skip:
double_unlock_balance(this_rq, src_rq);
}
 
-   return ret;
+   if (resched)
+   resched_task(this_rq->curr);
 }
 
 static void pre_schedule_dl(struct rq *rq, struct task_struct *prev)
-- 
1.9.1



[PATCH for v3.14.58 5/7] sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Remove the direct {push,pull} balancing operations from
switched_{from,to}_rt() / prio_changed_rt() and use the balance
callback queue.

Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.766832...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/rt.c
---
 kernel/sched/rt.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index d235fd7..0fb72ae 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -315,16 +315,23 @@ static inline int has_pushable_tasks(struct rq *rq)
return !plist_head_empty(&rq->rt.pushable_tasks);
 }
 
-static DEFINE_PER_CPU(struct callback_head, rt_balance_head);
+static DEFINE_PER_CPU(struct callback_head, rt_push_head);
+static DEFINE_PER_CPU(struct callback_head, rt_pull_head);
 
 static void push_rt_tasks(struct rq *);
+static void pull_rt_task(struct rq *);
 
 static inline void queue_push_tasks(struct rq *rq)
 {
if (!has_pushable_tasks(rq))
return;
 
-   queue_balance_callback(rq, &per_cpu(rt_balance_head, rq->cpu), push_rt_tasks);
+   queue_balance_callback(rq, &per_cpu(rt_push_head, rq->cpu), push_rt_tasks);
+}
+
+static inline void queue_pull_task(struct rq *rq)
+{
+   queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task);
 }
 
 static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
@@ -1837,7 +1844,7 @@ static void switched_from_rt(struct rq *rq, struct task_struct *p)
if (!p->on_rq || rq->rt.rt_nr_running)
return;
 
-   pull_rt_task(rq);
+   queue_pull_task(rq);
 }
 
 void init_sched_rt_class(void)
@@ -1858,8 +1865,6 @@ void init_sched_rt_class(void)
  */
 static void switched_to_rt(struct rq *rq, struct task_struct *p)
 {
-   int check_resched = 1;
-
/*
 * If we are already running, then there's nothing
 * that needs to be done. But if we are not running
@@ -1869,13 +1874,12 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 */
if (p->on_rq && rq->curr != p) {
 #ifdef CONFIG_SMP
-   if (rq->rt.overloaded && push_rt_task(rq) &&
-   /* Don't resched if we changed runqueues */
-   rq != task_rq(p))
-   check_resched = 0;
-#endif /* CONFIG_SMP */
-   if (check_resched && p->prio < rq->curr->prio)
+   if (rq->rt.overloaded)
+   queue_push_tasks(rq);
+#else
+   if (p->prio < rq->curr->prio)
resched_task(rq->curr);
+#endif /* CONFIG_SMP */
}
 }
 
@@ -1896,14 +1900,13 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio)
 * may need to pull tasks to this runqueue.
 */
if (oldprio < p->prio)
-   pull_rt_task(rq);
+   queue_pull_task(rq);
+
/*
 * If there's a higher priority task waiting to run
-* then reschedule. Note, the above pull_rt_task
-* can release the rq lock and p could migrate.
-* Only reschedule if p is still on the same runqueue.
+* then reschedule.
 */
-   if (p->prio > rq->rt.highest_prio.curr && rq->curr == p)
+   if (p->prio > rq->rt.highest_prio.curr)
resched_task(p);
 #else
/* For UP simply resched on drop of prio */
-- 
1.9.1



[PATCH for v3.14.58 1/7] sched: Clean up idle task SMP logic

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

The idle post_schedule flag is just a vile waste of time; furthermore,
it appears unneeded. Move the idle_enter_fair() call into
pick_next_task_idle().

Signed-off-by: Peter Zijlstra 
Cc: Daniel Lezcano 
Cc: Vincent Guittot 
Cc: alex@linaro.org
Cc: mi...@kernel.org
Cc: Steven Rostedt 
Link: http://lkml.kernel.org/n/tip-aljykihtxjt3mkokxi0qz...@git.kernel.org
Signed-off-by: Ingo Molnar 
---
 kernel/sched/idle_task.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index 516c3d9..d08678d 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -19,11 +19,6 @@ static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
idle_exit_fair(rq);
rq_last_tick_reset(rq);
 }
-
-static void post_schedule_idle(struct rq *rq)
-{
-   idle_enter_fair(rq);
-}
 #endif /* CONFIG_SMP */
 /*
  * Idle tasks are unconditionally rescheduled:
@@ -37,8 +32,7 @@ static struct task_struct *pick_next_task_idle(struct rq *rq)
 {
schedstat_inc(rq, sched_goidle);
 #ifdef CONFIG_SMP
-   /* Trigger the post schedule to do an idle_enter for CFS */
-   rq->post_schedule = 1;
+   idle_enter_fair(rq);
 #endif
return rq->idle;
 }
@@ -102,7 +96,6 @@ const struct sched_class idle_sched_class = {
 #ifdef CONFIG_SMP
.select_task_rq = select_task_rq_idle,
.pre_schedule   = pre_schedule_idle,
-   .post_schedule  = post_schedule_idle,
 #endif
 
.set_curr_task  = set_curr_task_idle,
-- 
1.9.1



[PATCH for v3.14.58 7/7] sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Remove the direct {push,pull} balancing operations from
switched_{from,to}_dl() / prio_changed_dl() and use the balance
callback queue.

Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.968262...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/deadline.c
---
 kernel/sched/deadline.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ec1f21d..6ab59bb 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -210,16 +210,23 @@ static inline int has_pushable_dl_tasks(struct rq *rq)
 
 static int push_dl_task(struct rq *rq);
 
-static DEFINE_PER_CPU(struct callback_head, dl_balance_head);
+static DEFINE_PER_CPU(struct callback_head, dl_push_head);
+static DEFINE_PER_CPU(struct callback_head, dl_pull_head);
 
 static void push_dl_tasks(struct rq *);
+static void pull_dl_task(struct rq *);
 
 static inline void queue_push_tasks(struct rq *rq)
 {
if (!has_pushable_dl_tasks(rq))
return;
 
-   queue_balance_callback(rq, &per_cpu(dl_balance_head, rq->cpu), 
push_dl_tasks);
+   queue_balance_callback(rq, &per_cpu(dl_push_head, rq->cpu), 
push_dl_tasks);
+}
+
+static inline void queue_pull_task(struct rq *rq)
+{
+   queue_balance_callback(rq, &per_cpu(dl_pull_head, rq->cpu), 
pull_dl_task);
 }
 
 #else
@@ -247,6 +254,10 @@ void dec_dl_migration(struct sched_dl_entity *dl_se, 
struct dl_rq *dl_rq)
 static inline void queue_push_tasks(struct rq *rq)
 {
 }
+
+static inline void queue_pull_task(struct rq *rq)
+{
+}
 #endif /* CONFIG_SMP */
 
 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags);
@@ -1541,7 +1552,7 @@ static void switched_from_dl(struct rq *rq, struct 
task_struct *p)
 * from an overloaded cpu, if any.
 */
if (!rq->dl.dl_nr_running)
-   pull_dl_task(rq);
+   queue_pull_task(rq);
 #endif
 }
 
@@ -1551,8 +1562,6 @@ static void switched_from_dl(struct rq *rq, struct 
task_struct *p)
  */
 static void switched_to_dl(struct rq *rq, struct task_struct *p)
 {
-   int check_resched = 1;
-
/*
 * If p is throttled, don't consider the possibility
 * of preempting rq->curr, the check will be done right
@@ -1563,12 +1572,12 @@ static void switched_to_dl(struct rq *rq, struct 
task_struct *p)
 
if (p->on_rq || rq->curr != p) {
 #ifdef CONFIG_SMP
-   if (rq->dl.overloaded && push_dl_task(rq) && rq != task_rq(p))
-   /* Only reschedule if pushing failed */
-   check_resched = 0;
-#endif /* CONFIG_SMP */
-   if (check_resched && task_has_dl_policy(rq->curr))
+   if (rq->dl.overloaded)
+   queue_push_tasks(rq);
+#else
+   if (task_has_dl_policy(rq->curr))
check_preempt_curr_dl(rq, p, 0);
+#endif /* CONFIG_SMP */
}
 }
 
@@ -1588,15 +1597,14 @@ static void prio_changed_dl(struct rq *rq, struct 
task_struct *p,
 * or lowering its prio, so...
 */
if (!rq->dl.overloaded)
-   pull_dl_task(rq);
+   queue_pull_task(rq);
 
/*
 * If we now have a earlier deadline task than p,
 * then reschedule, provided p is still on this
 * runqueue.
 */
-   if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline) &&
-   rq->curr == p)
+   if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
resched_task(p);
 #else
/*
-- 
1.9.1



[PATCH for v3.14.58 3/7] sched: Allow balance callbacks for check_class_changed()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to remove dropping rq->lock from the
switched_{to,from}()/prio_changed() sched_class methods, run the
balance callbacks after it.

We need to remove dropping rq->lock because it's buggy:
suppose we use sched_setattr()/sched_setscheduler() to change a running
task from FIFO to OTHER.

By the time we get to switched_from_rt() the task is already enqueued
on the cfs runqueues. If switched_from_rt() does pull_rt_task() and
drops rq->lock, load-balancing can come in and move our task @p to
another rq.

The subsequent switched_to_fair() still assumes @p is on @rq and bad
things will happen.

By using balance callbacks we delay the load-balancing operations
{rt,dl}x{push,pull} until we've done all the important work and the
task is fully set up.

Furthermore, the balance callbacks do not know about @p, therefore
they cannot get confused like this.

Reported-by: Mike Galbraith 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Link: http://lkml.kernel.org/r/20150611124742.615343...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/core.c
---
 kernel/sched/core.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cc1be56..459cc86 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -937,6 +937,13 @@ inline int task_curr(const struct task_struct *p)
return cpu_curr(task_cpu(p)) == p;
 }
 
+/*
+ * switched_from, switched_to and prio_changed must _NOT_ drop rq->lock,
+ * use the balance_callback list if you want balancing.
+ *
+ * this means any call to check_class_changed() must be followed by a call to
+ * balance_callback().
+ */
 static inline void check_class_changed(struct rq *rq, struct task_struct *p,
   const struct sched_class *prev_class,
   int oldprio)
@@ -1423,8 +1430,12 @@ ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int 
wake_flags)
 
p->state = TASK_RUNNING;
 #ifdef CONFIG_SMP
-   if (p->sched_class->task_woken)
+   if (p->sched_class->task_woken) {
+   /*
+* XXX can drop rq->lock; most likely ok.
+*/
p->sched_class->task_woken(rq, p);
+   }
 
if (rq->idle_stamp) {
u64 delta = rq_clock(rq) - rq->idle_stamp;
@@ -3006,7 +3017,11 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 
check_class_changed(rq, p, prev_class, oldprio);
 out_unlock:
+   preempt_disable(); /* avoid rq from going away on us */
__task_rq_unlock(rq);
+
+   balance_callback(rq);
+   preempt_enable();
 }
 #endif
 
@@ -3512,10 +3527,17 @@ change:
enqueue_task(rq, p, 0);
 
check_class_changed(rq, p, prev_class, oldprio);
+   preempt_disable(); /* avoid rq from going away on us */
task_rq_unlock(rq, p, &flags);
 
rt_mutex_adjust_pi(p);
 
+   /*
+* Run balance callbacks after we've adjusted the PI chain.
+*/
+   balance_callback(rq);
+   preempt_enable();
+
return 0;
 }
 
-- 
1.9.1
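
The ordering this patch enforces can be illustrated with a small user-space model. This is only a sketch: the struct, the boolean "lock" and every function name below are hypothetical stand-ins, not the kernel's definitions. It shows the class-change hooks queueing balancing work while the lock is held, and the work running only after the task is fully set up and the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified runqueue; not the kernel's struct rq. */
struct rq {
	bool locked;                    /* models rq->lock being held */
	bool task_set_up;               /* set once the new class owns the task */
	void (*pending)(struct rq *rq); /* models one queued balance callback */
	bool balanced_after_setup;      /* what the callback observed */
};

/* Deferred balancing work: records whether the task was set up by then. */
static void pull_work(struct rq *rq)
{
	rq->balanced_after_setup = rq->task_set_up;
}

/* Models switched_from_*(): may only queue work, never drop the lock. */
static void switched_from(struct rq *rq)
{
	assert(rq->locked);
	rq->pending = pull_work;
}

/* Models switched_to_*(): finishes setting up the task in its new class. */
static void switched_to(struct rq *rq)
{
	assert(rq->locked);
	rq->task_set_up = true;
}

/* Models a caller of check_class_changed() followed by balance_callback(). */
static void change_class(struct rq *rq)
{
	rq->locked = true;
	switched_from(rq);
	switched_to(rq);
	rq->locked = false;

	/* Only now run the queued work (the real callback retakes the lock). */
	if (rq->pending) {
		void (*fn)(struct rq *rq) = rq->pending;

		rq->pending = NULL;
		fn(rq);
	}
}
```

Calling change_class() on a zeroed struct leaves balanced_after_setup set, i.e. the balancing step never saw a half-switched task, which is the property the changelog argues for.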



Re: [PATCH -next] MIPS: VDSO: Fix build error with binutils 2.24 and earlier

2016-01-05 Thread Michal Marek
On 2015-12-24 13:57, James Hogan wrote:
> On Thu, Dec 24, 2015 at 12:48:12PM +, James Hogan wrote:
>> Hi Guenter,
>>
>> On Wed, Dec 23, 2015 at 09:04:31PM -0800, Guenter Roeck wrote:
>>> Commit 2a037f310bab ("MIPS: VDSO: Fix build error") tries to fix a build
>>> error seen with binutils 2.24 and earlier. However, the fix does not work,
>>> and again results in the already known build errors if the kernel is built
>>> with an earlier version of binutils.
>>>
>>> CC  arch/mips/vdso/gettimeofday.o
>>> /tmp/ccnOVbHT.s: Assembler messages:
>>> /tmp/ccnOVbHT.s:50: Error: can't resolve `_start' {*UND* section} - `L0 
>>> {.text section}
>>> /tmp/ccnOVbHT.s:374: Error: can't resolve `_start' {*UND* section} - `L0 
>>> {.text section}
>>> scripts/Makefile.build:258: recipe for target 
>>> 'arch/mips/vdso/gettimeofday.o' failed
>>> make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
>>>
>>> Fixes: 2a037f310bab ("MIPS: VDSO: Fix build error")
>>> Cc: Qais Yousef 
>>> Signed-off-by: Guenter Roeck 
>>> ---
>>> Tested with binutils 2.25 and 2.22.
>>>
>>>  arch/mips/vdso/Makefile | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
>>> index 018f8c7b94f2..14568900fc1d 100644
>>> --- a/arch/mips/vdso/Makefile
>>> +++ b/arch/mips/vdso/Makefile
>>> @@ -26,7 +26,7 @@ aflags-vdso := $(ccflags-vdso) \
>>>  # the comments on that file.
>>>  #
>>>  ifndef CONFIG_CPU_MIPSR6
>>> -  ifeq ($(call ld-ifversion, -lt, 2250, y),)
>>> +  ifeq ($(call ld-ifversion, -lt, 2250, y),y)
>>
>> I agree this is semantically correct, but there is something more evil
>> going on here.
>>
>> Originally the check was version <= 2.24
>> Qais' patch changed it to version >= 2.25 (intending version < 2.25)
>> Your patch changes it to version < 2.25
>>
>> I think the reason this fixed the problem for Qais is actually that he
>> probably had a similar toolchain version to what I'm using:
>>
>> GNU ld (Codescape GNU Tools 2015.06-05 for MIPS MTI Linux) 2.24.90
>>
>> ./scripts/ld-version.sh does this:
>>
>> print a[1]*1000 + a[2]*10 + a[3]*1 + a[4]*100 + a[5];
>>
>> which changes that version number into:
>>  2000
>> + 240
>> +  90 = 2330
>>
>> I.e. it doesn't expect a[3] to be >= 10.
>>
>> Should we do something like this (increase multipliers on a[1] and
>> a[2])?:
>>
>> diff --git a/scripts/ld-version.sh b/scripts/ld-version.sh
>> index 198580d245e0..0b67edc5bc6f 100755
>> --- a/scripts/ld-version.sh
>> +++ b/scripts/ld-version.sh
>> @@ -3,6 +3,6 @@
>>  {
>>  gsub(".*)", "");
>>  split($1,a, ".");
>> -print a[1]*1000 + a[2]*10 + a[3]*1 + a[4]*100 + a[5];
>> +print a[1]*10000 + a[2]*100 + a[3]*1 + a[4]*100 + a[5];
>>  exit
>>  }
>>
>> which gives 2.24.90 => 22490.
>>
>> All call sites would need updating too to add the extra 0, but a quick
>> git grep isn't showing any other ones than this one.
> 
> Actually, linux-next includes this commit which uses ld-ifversion too:
> 
> 19a3cc83353e3bb4bc28769f8606139a3d350d2d
> "Kbuild, lto: Add Link Time Optimization support v3"

That commit needs updating for other reasons, so feel free to fix
ld-ifversion and its usage in arch/mips.

Michal
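
The arithmetic James describes can be checked with a few lines of C. This is a sketch only: it uses the weights exactly as they are printed in this archived copy of the thread (the real script may use larger ones), and for the widened variant it assumes weights chosen so that 2.24.90 maps to 22490, the value quoted above. The helper names are made up for the illustration.

```c
#include <assert.h>

/* Combine the first three version components with the given weights. */
static long ld_version_code(int major, int minor, int patch,
			    long w_major, long w_minor, long w_patch)
{
	assert(major >= 0 && minor >= 0 && patch >= 0);
	return major * w_major + minor * w_minor + patch * w_patch;
}

/* Weights as quoted for the existing script: a patch level >= 10 overflows
 * into the minor field. */
static long code_current(int major, int minor, int patch)
{
	return ld_version_code(major, minor, patch, 1000, 10, 1);
}

/* Assumed widened weights: room for two-digit minor and patch levels. */
static long code_widened(int major, int minor, int patch)
{
	return ld_version_code(major, minor, patch, 10000, 100, 1);
}
```

With the current weights, 2.24.90 encodes to 2330 and 2.25 to 2250, so a "less than 2.25" test is false for a 2.24.90 linker. With the widened weights the two encode to 22490 and 22500 and compare in the expected order.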



Re: [PATCH 2/5] perf tools: Add all matching dynamic sort keys for field name

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 12:03:44PM +0900, Namhyung Kim wrote:

SNIP

>  static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
>  {
>   char *str, *event_name, *field_name, *opt_name;
> @@ -1995,7 +2017,12 @@ static int add_dynamic_entry(struct perf_evlist 
> *evlist, const char *tok)
>   }
>  
>   if (!strcmp(field_name, "trace_fields")) {
> - ret = add_all_dynamic_fields(evlist ,raw_trace);
> + ret = add_all_dynamic_fields(evlist, raw_trace);
> + goto out;
> + }
> +
> + if (event_name == NULL) {
> + ret = add_all_matching_fields(evlist, field_name, raw_trace);
>   goto out;

should this be handled within find_evsel function:

/* case 1 */
if (event_name == NULL) {
if (evlist->nr_entries != 1) {


looks like it'd be dead code otherwise

thanks,
jirka


Re: [PATCH] BTRFS: Runs the xor function if a Block has failed

2016-01-05 Thread David Sterba
On Wed, Dec 30, 2015 at 09:15:44PM -0500, Sanidhya Solanki wrote:
> On Wed, 30 Dec 2015 18:18:26 +0100
> David Sterba  wrote:
> 
> > That's just the comment copied, the changelog does not explain why
> > it's ok to do just the run_xor there. It does not seem trivial to me.
> > Please describe that the end result after the code change is expected.
> 
> In the RAID 6 case after a failure, we discover that the failure
> affected the entire P stripe, without any bad data occurring. Hence, we
> xor the previously stored parity data to return the data that was lost
> in the P stripe failure.
> 
> The xor-red data is from the parity blocks. Hence, we are left with 
> recovered data belonging to the P stripe.

If the data are recovered, why is -EIO still returned? Also, I see some
post-recovery steps, e.g. for the damaged P stripes (at label pstripes),
and I'd expect something similar for the case you're fixing.

I'm not familiar with the raid56 implementation but the fix looks
suspiciously trivial and I doubt that the xor was omitted out of
laziness.


[PATCH for v3.18.25 1/6] sched: Replace post_schedule with a balance callback list

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Generalize the post_schedule() stuff into a balance callback list.
This allows us to more easily use it outside of schedule() and cross
sched_class.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.424032...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/core.c
---
 kernel/sched/core.c | 36 
 kernel/sched/deadline.c | 21 +++--
 kernel/sched/rt.c   | 25 +++--
 kernel/sched/sched.h| 19 +--
 4 files changed, 63 insertions(+), 38 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a38f987..3dff0ef 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2240,23 +2240,35 @@ static void finish_task_switch(struct rq *rq, struct 
task_struct *prev)
 #ifdef CONFIG_SMP
 
 /* rq->lock is NOT held, but preemption is disabled */
-static inline void post_schedule(struct rq *rq)
+static void __balance_callback(struct rq *rq)
 {
-   if (rq->post_schedule) {
-   unsigned long flags;
+   struct callback_head *head, *next;
+   void (*func)(struct rq *rq);
+   unsigned long flags;
 
-   raw_spin_lock_irqsave(&rq->lock, flags);
-   if (rq->curr->sched_class->post_schedule)
-   rq->curr->sched_class->post_schedule(rq);
-   raw_spin_unlock_irqrestore(&rq->lock, flags);
+   raw_spin_lock_irqsave(&rq->lock, flags);
+   head = rq->balance_callback;
+   rq->balance_callback = NULL;
+   while (head) {
+   func = (void (*)(struct rq *))head->func;
+   next = head->next;
+   head->next = NULL;
+   head = next;
 
-   rq->post_schedule = 0;
+   func(rq);
}
+   raw_spin_unlock_irqrestore(&rq->lock, flags);
+}
+
+static inline void balance_callback(struct rq *rq)
+{
+   if (unlikely(rq->balance_callback))
+   __balance_callback(rq);
 }
 
 #else
 
-static inline void post_schedule(struct rq *rq)
+static inline void balance_callback(struct rq *rq)
 {
 }
 
@@ -2277,7 +2289,7 @@ asmlinkage __visible void schedule_tail(struct 
task_struct *prev)
 * FIXME: do we need to worry about rq being invalidated by the
 * task_switch?
 */
-   post_schedule(rq);
+   balance_callback(rq);
 
 #ifdef __ARCH_WANT_UNLOCKED_CTXSW
/* In this case, finish_task_switch does not reenable preemption */
@@ -2804,7 +2816,7 @@ need_resched:
} else
raw_spin_unlock_irq(&rq->lock);
 
-   post_schedule(rq);
+   balance_callback(rq);
 
sched_preempt_enable_no_resched();
if (need_resched())
@@ -6973,7 +6985,7 @@ void __init sched_init(void)
rq->sd = NULL;
rq->rd = NULL;
rq->cpu_capacity = SCHED_CAPACITY_SCALE;
-   rq->post_schedule = 0;
+   rq->balance_callback = NULL;
rq->active_balance = 0;
rq->next_balance = jiffies;
rq->push_cpu = 0;
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fc4f98b1..3bdf558 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -213,9 +213,16 @@ static inline bool need_pull_dl_task(struct rq *rq, struct 
task_struct *prev)
return dl_task(prev);
 }
 
-static inline void set_post_schedule(struct rq *rq)
+static DEFINE_PER_CPU(struct callback_head, dl_balance_head);
+
+static void push_dl_tasks(struct rq *);
+
+static inline void queue_push_tasks(struct rq *rq)
 {
-   rq->post_schedule = has_pushable_dl_tasks(rq);
+   if (!has_pushable_dl_tasks(rq))
+   return;
+
+   queue_balance_callback(rq, &per_cpu(dl_balance_head, rq->cpu), 
push_dl_tasks);
 }
 
 #else
@@ -250,7 +257,7 @@ static inline int pull_dl_task(struct rq *rq)
return 0;
 }
 
-static inline void set_post_schedule(struct rq *rq)
+static inline void queue_push_tasks(struct rq *rq)
 {
 }
 #endif /* CONFIG_SMP */
@@ -1060,7 +1067,7 @@ struct task_struct *pick_next_task_dl(struct rq *rq, 
struct task_struct *prev)
start_hrtick_dl(rq, p);
 #endif
 
-   set_post_schedule(rq);
+   queue_push_tasks(rq);
 
return p;
 }
@@ -1469,11 +1476,6 @@ skip:
return ret;
 }
 
-static void post_schedule_dl(struct rq *rq)
-{
-   push_dl_tasks(rq);
-}
-
 /*
  * Since the task is not running and a reschedule is not going to happen
  * anytime soon on its runqueue, we try pushing it away now.
@@ -1661,7 +1663,6 @@ const struct sched_class dl_sched_class = {
.set_cpus_allowed   = set_cpus_allowed_dl,
.rq_online  = rq_online_dl,
.rq_offl
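
The callback list that replaces post_schedule() in this patch can be modelled in user space as follows. This is a sketch under stated assumptions: the types are simplified stand-ins (the kernel stores the function pointer with a cast and takes rq->lock around the loop), and queue_callback(), run_callbacks(), demo_push() and demo_pull() are hypothetical names. The run loop follows the shape of __balance_callback() in the diff: detach the whole list, then call each entry once.

```c
#include <assert.h>
#include <stddef.h>

struct rq;

/* Simplified stand-in for the kernel's callback_head. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct rq *rq);
};

/* Simplified stand-in for the runqueue. */
struct rq {
	struct callback_head *balance_callback; /* pending callbacks */
	int pushes;                              /* demo counters */
	int pulls;
};

/* Put a callback at the front of the runqueue's pending list. */
static void queue_callback(struct rq *rq, struct callback_head *head,
			   void (*func)(struct rq *rq))
{
	assert(rq && head && func);
	head->func = func;
	head->next = rq->balance_callback;
	rq->balance_callback = head;
}

/* Detach the list, then run every entry once; returns how many ran. */
static int run_callbacks(struct rq *rq)
{
	struct callback_head *head = rq->balance_callback;
	int ran = 0;

	rq->balance_callback = NULL;
	while (head) {
		struct callback_head *next = head->next;
		void (*func)(struct rq *rq) = head->func;

		head->next = NULL; /* the entry may be queued again later */
		head = next;
		func(rq);
		ran++;
	}
	return ran;
}

static void demo_push(struct rq *rq) { rq->pushes++; }
static void demo_pull(struct rq *rq) { rq->pulls++; }
```

Queueing demo_push and demo_pull on two separate heads and then calling run_callbacks() runs each once and leaves the list empty, so a second call does nothing. Because the list is not tied to a sched_class method, any class can queue work on it, which is the generalisation the changelog describes.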

[PATCH for v3.18.25 2/6] sched: Allow balance callbacks for check_class_changed()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to remove dropping rq->lock from the
switched_{to,from}()/prio_changed() sched_class methods, run the
balance callbacks after it.

We need to remove dropping rq->lock because it's buggy:
suppose we use sched_setattr()/sched_setscheduler() to change a running
task from FIFO to OTHER.

By the time we get to switched_from_rt() the task is already enqueued
on the cfs runqueues. If switched_from_rt() does pull_rt_task() and
drops rq->lock, load-balancing can come in and move our task @p to
another rq.

The subsequent switched_to_fair() still assumes @p is on @rq and bad
things will happen.

By using balance callbacks we delay the load-balancing operations
{rt,dl}x{push,pull} until we've done all the important work and the
task is fully set up.

Furthermore, the balance callbacks do not know about @p, therefore
they cannot get confused like this.

Reported-by: Mike Galbraith 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Link: http://lkml.kernel.org/r/20150611124742.615343...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/core.c
---
 kernel/sched/core.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3dff0ef..ca6dad6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -999,6 +999,13 @@ inline int task_curr(const struct task_struct *p)
return cpu_curr(task_cpu(p)) == p;
 }
 
+/*
+ * switched_from, switched_to and prio_changed must _NOT_ drop rq->lock,
+ * use the balance_callback list if you want balancing.
+ *
+ * this means any call to check_class_changed() must be followed by a call to
+ * balance_callback().
+ */
 static inline void check_class_changed(struct rq *rq, struct task_struct *p,
   const struct sched_class *prev_class,
   int oldprio)
@@ -1485,8 +1492,12 @@ ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int 
wake_flags)
 
p->state = TASK_RUNNING;
 #ifdef CONFIG_SMP
-   if (p->sched_class->task_woken)
+   if (p->sched_class->task_woken) {
+   /*
+* XXX can drop rq->lock; most likely ok.
+*/
p->sched_class->task_woken(rq, p);
+   }
 
if (rq->idle_stamp) {
u64 delta = rq_clock(rq) - rq->idle_stamp;
@@ -3032,7 +3043,11 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 
check_class_changed(rq, p, prev_class, oldprio);
 out_unlock:
+   preempt_disable(); /* avoid rq from going away on us */
__task_rq_unlock(rq);
+
+   balance_callback(rq);
+   preempt_enable();
 }
 #endif
 
@@ -3559,10 +3574,17 @@ change:
}
 
check_class_changed(rq, p, prev_class, oldprio);
+   preempt_disable(); /* avoid rq from going away on us */
task_rq_unlock(rq, p, &flags);
 
rt_mutex_adjust_pi(p);
 
+   /*
+* Run balance callbacks after we've adjusted the PI chain.
+*/
+   balance_callback(rq);
+   preempt_enable();
+
return 0;
 }
 
-- 
1.9.1



[PATCH for v3.18.25 3/6] sched,rt: Remove return value from pull_rt_task()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to be able to use pull_rt_task() from a callback, we need to
do away with the return value.

Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to rt balancing.

Too many reschedules is not a correctness issue; too few is.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.679002...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/rt.c
---
 kernel/sched/rt.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 5a91237..ce807aa 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -244,7 +244,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct 
task_group *parent)
 
 #ifdef CONFIG_SMP
 
-static int pull_rt_task(struct rq *this_rq);
+static void pull_rt_task(struct rq *this_rq);
 
 static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *prev)
 {
@@ -399,9 +399,8 @@ static inline bool need_pull_rt_task(struct rq *rq, struct 
task_struct *prev)
return false;
 }
 
-static inline int pull_rt_task(struct rq *this_rq)
+static inline void pull_rt_task(struct rq *this_rq)
 {
-   return 0;
 }
 
 static inline void queue_push_tasks(struct rq *rq)
@@ -1757,14 +1756,15 @@ static void push_rt_tasks(struct rq *rq)
;
 }
 
-static int pull_rt_task(struct rq *this_rq)
+static void pull_rt_task(struct rq *this_rq)
 {
-   int this_cpu = this_rq->cpu, ret = 0, cpu;
+   int this_cpu = this_rq->cpu, cpu;
+   bool resched = false;
struct task_struct *p;
struct rq *src_rq;
 
if (likely(!rt_overloaded(this_rq)))
-   return 0;
+   return;
 
/*
 * Match the barrier from rt_set_overloaded; this guarantees that if we
@@ -1821,7 +1821,7 @@ static int pull_rt_task(struct rq *this_rq)
if (p->prio < src_rq->curr->prio)
goto skip;
 
-   ret = 1;
+   resched = true;
 
deactivate_task(src_rq, p, 0);
set_task_cpu(p, this_cpu);
@@ -1837,7 +1837,8 @@ skip:
double_unlock_balance(this_rq, src_rq);
}
 
-   return ret;
+   if (resched)
+   resched_curr(this_rq);
 }
 
 /*
@@ -1933,8 +1934,7 @@ static void switched_from_rt(struct rq *rq, struct 
task_struct *p)
if (!p->on_rq || rq->rt.rt_nr_running)
return;
 
-   if (pull_rt_task(rq))
-   resched_curr(rq);
+   pull_rt_task(rq);
 }
 
 void __init init_sched_rt_class(void)
-- 
1.9.1
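
The reason for dropping the return value is easier to see in isolation. The sketch below is a user-space model with hypothetical types: struct rq here holds only a counter and a flag, resched_curr() is a stand-in that sets the flag, and pull_task_old(), pull_task_new() and run_as_callback() are invented names. It contrasts the old shape, where the caller had to act on the result, with the new void shape that fits a callback pointer taking only the runqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified runqueue. */
struct rq {
	int nr_pullable;   /* tasks that could be pulled to this runqueue */
	bool need_resched; /* set when the current task should be preempted */
};

/* Stand-in for the kernel helper of the same name. */
static void resched_curr(struct rq *rq)
{
	assert(rq);
	rq->need_resched = true;
}

/* Old shape: reports whether anything was pulled; caller must reschedule. */
static int pull_task_old(struct rq *rq)
{
	int ret = 0;

	while (rq->nr_pullable > 0) {
		rq->nr_pullable--;
		ret = 1;
	}
	return ret;
}

/* New shape: returns nothing and reschedules itself when it pulled. */
static void pull_task_new(struct rq *rq)
{
	bool resched = false;

	while (rq->nr_pullable > 0) {
		rq->nr_pullable--;
		resched = true;
	}
	if (resched)
		resched_curr(rq);
}

/* A callback slot has no way to consume a return value. */
typedef void (*balance_fn)(struct rq *rq);

static void run_as_callback(struct rq *rq, balance_fn fn)
{
	fn(rq);
}
```

A caller that ignores the result of pull_task_old() leaves need_resched clear even though tasks were pulled; pull_task_new() cannot be misused that way, at the cost of possibly rescheduling more often, as the changelog notes.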



[PATCH for v3.18.25 5/6] sched,dl: Remove return value from pull_dl_task()

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

In order to be able to use pull_dl_task() from a callback, we need to
do away with the return value.

Since the return value indicates if we should reschedule, do this
inside the function. Since not all callers currently do this, this can
increase the number of reschedules due to rt balancing.

Too many reschedules is not a correctness issue; too few is.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.859398...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/deadline.c
---
 kernel/sched/deadline.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 3bdf558..822b94f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -252,9 +252,8 @@ static inline bool need_pull_dl_task(struct rq *rq, struct 
task_struct *prev)
return false;
 }
 
-static inline int pull_dl_task(struct rq *rq)
+static inline void pull_dl_task(struct rq *rq)
 {
-   return 0;
 }
 
 static inline void queue_push_tasks(struct rq *rq)
@@ -974,7 +973,7 @@ static void check_preempt_equal_dl(struct rq *rq, struct 
task_struct *p)
resched_curr(rq);
 }
 
-static int pull_dl_task(struct rq *this_rq);
+static void pull_dl_task(struct rq *this_rq);
 
 #endif /* CONFIG_SMP */
 
@@ -1397,15 +1396,16 @@ static void push_dl_tasks(struct rq *rq)
;
 }
 
-static int pull_dl_task(struct rq *this_rq)
+static void pull_dl_task(struct rq *this_rq)
 {
-   int this_cpu = this_rq->cpu, ret = 0, cpu;
+   int this_cpu = this_rq->cpu, cpu;
struct task_struct *p;
+   bool resched = false;
struct rq *src_rq;
u64 dmin = LONG_MAX;
 
if (likely(!dl_overloaded(this_rq)))
-   return 0;
+   return;
 
/*
 * Match the barrier from dl_set_overloaded; this guarantees that if we
@@ -1460,7 +1460,7 @@ static int pull_dl_task(struct rq *this_rq)
   src_rq->curr->dl.deadline))
goto skip;
 
-   ret = 1;
+   resched = true;
 
deactivate_task(src_rq, p, 0);
set_task_cpu(p, this_cpu);
@@ -1473,7 +1473,8 @@ skip:
double_unlock_balance(this_rq, src_rq);
}
 
-   return ret;
+   if (resched)
+   resched_curr(this_rq);
 }
 
 /*
-- 
1.9.1



Re: [PATCH 3/5] perf tools: Add document for dynamic sort keys

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 12:03:45PM +0900, Namhyung Kim wrote:
> Signed-off-by: Namhyung Kim 

Acked-by: Jiri Olsa 

thanks,
jirka

> ---
>  tools/perf/Documentation/perf-report.txt | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt 
> b/tools/perf/Documentation/perf-report.txt
> index ae7cd91727f6..8a301f6afb37 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -117,6 +117,30 @@ OPTIONS
>   And default sort keys are changed to comm, dso_from, symbol_from, dso_to
>   and symbol_to, see '--branch-stack'.
>  
> + If the data file has tracepoint event(s), following (dynamic) sort keys
> + are also available:
> + trace, trace_fields, [<event>.]<field>[/raw]
> +
> + - trace: pretty printed trace output in a single column
> + - trace_fields: fields in tracepoints in separate columns
> + - <field name>: optional event and field name for a specific field
> +
> + The last form consists of event and field names.  If event name is
> + omitted, it searches all events for matching field name.  The matched
> + field will be shown only for the event has the field.  The event name
> + supports substring match so user doesn't need to specify full subsystem
> + and event name everytime.  For example, 'sched:sched_switch' event can
> + be shortened to 'switch' as long as it's not ambiguous.  Also event can
> + be specified by its index (starting from 1) preceded by the '%'.
> + So '%1' is the first event, '%2' is the second, and so on.
> +
> + The field name can have '/raw' suffix which disables pretty printing
> + and shows raw field value like hex numbers.  The --raw-trace option
> + has the same effect for all dynamic sort keys.
> +
> + The default sort keys are changed to 'trace' if all events in the data
> + file are tracepoint.
> +
>  -F::
>  --fields=::
>   Specify output field - multiple keys can be specified in CSV format.
> -- 
> 2.6.4
> 
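
To make the key syntax in the documentation above concrete, here is a sketch of how such a token could be split. It is not perf's parser: struct sort_key and parse_dynamic_key() are hypothetical, and the sketch assumes the key consists of an optional event name, a dot, a field name and an optional '/raw' suffix, as the prose describes. Substring matching of event names and resolving a '%N' index are left out; the event part is only extracted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of splitting one dynamic sort key. */
struct sort_key {
	char event[64]; /* empty string when no event was given */
	char field[64];
	bool raw;       /* true when the key ended in "/raw" */
};

/* Split "[event.]field[/raw]"; returns 0 on success, -1 on a malformed key. */
static int parse_dynamic_key(const char *tok, struct sort_key *out)
{
	const char *dot, *slash, *field;
	size_t elen, flen;

	assert(tok && out);
	memset(out, 0, sizeof(*out));

	/* The only option this sketch accepts after '/' is "raw". */
	slash = strchr(tok, '/');
	if (slash) {
		if (strcmp(slash, "/raw") != 0)
			return -1;
		out->raw = true;
	} else {
		slash = tok + strlen(tok);
	}

	/* An event name, if present, is everything before the first dot. */
	dot = memchr(tok, '.', (size_t)(slash - tok));
	field = dot ? dot + 1 : tok;
	elen = dot ? (size_t)(dot - tok) : 0;
	flen = (size_t)(slash - field);

	if (flen == 0 || flen >= sizeof(out->field))
		return -1;
	if (dot && (elen == 0 || elen >= sizeof(out->event)))
		return -1;

	memcpy(out->event, tok, elen);
	memcpy(out->field, field, flen);
	return 0;
}
```

For example, "sched:sched_switch.next_pid/raw" splits into event "sched:sched_switch", field "next_pid" and raw set, while a bare "prev_comm" leaves the event empty, which is the case where all events are searched for a matching field.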


[PATCH for v3.18.25 6/6] sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Remove the direct {push,pull} balancing operations from
switched_{from,to}_dl() / prio_changed_dl() and use the balance
callback queue.

Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.968262...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/deadline.c
---
 kernel/sched/deadline.c | 42 +++---
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1242e5b..6762024 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -213,16 +213,23 @@ static inline bool need_pull_dl_task(struct rq *rq, 
struct task_struct *prev)
return dl_task(prev);
 }
 
-static DEFINE_PER_CPU(struct callback_head, dl_balance_head);
+static DEFINE_PER_CPU(struct callback_head, dl_push_head);
+static DEFINE_PER_CPU(struct callback_head, dl_pull_head);
 
 static void push_dl_tasks(struct rq *);
+static void pull_dl_task(struct rq *);
 
 static inline void queue_push_tasks(struct rq *rq)
 {
if (!has_pushable_dl_tasks(rq))
return;
 
-   queue_balance_callback(rq, &per_cpu(dl_balance_head, rq->cpu), 
push_dl_tasks);
+   queue_balance_callback(rq, &per_cpu(dl_push_head, rq->cpu), 
push_dl_tasks);
+}
+
+static inline void queue_pull_task(struct rq *rq)
+{
+   queue_balance_callback(rq, &per_cpu(dl_pull_head, rq->cpu), 
pull_dl_task);
 }
 
 #else
@@ -259,6 +266,10 @@ static inline void pull_dl_task(struct rq *rq)
 static inline void queue_push_tasks(struct rq *rq)
 {
 }
+
+static inline void queue_pull_task(struct rq *rq)
+{
+}
 #endif /* CONFIG_SMP */
 
 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags);
@@ -975,8 +986,6 @@ static void check_preempt_equal_dl(struct rq *rq, struct 
task_struct *p)
resched_curr(rq);
 }
 
-static void pull_dl_task(struct rq *this_rq);
-
 #endif /* CONFIG_SMP */
 
 /*
@@ -1586,7 +1595,7 @@ static void switched_from_dl(struct rq *rq, struct 
task_struct *p)
 * from an overloaded cpu, if any.
 */
if (!rq->dl.dl_nr_running)
-   pull_dl_task(rq);
+   queue_pull_task(rq);
 #endif
 }
 
@@ -1596,8 +1605,6 @@ static void switched_from_dl(struct rq *rq, struct 
task_struct *p)
  */
 static void switched_to_dl(struct rq *rq, struct task_struct *p)
 {
-   int check_resched = 1;
-
/*
 * If p is throttled, don't consider the possibility
 * of preempting rq->curr, the check will be done right
@@ -1608,16 +1615,14 @@ static void switched_to_dl(struct rq *rq, struct 
task_struct *p)
 
if (task_on_rq_queued(p) && rq->curr != p) {
 #ifdef CONFIG_SMP
-   if (rq->dl.overloaded && push_dl_task(rq) && rq != task_rq(p))
-   /* Only reschedule if pushing failed */
-   check_resched = 0;
+   if (rq->dl.overloaded)
+   queue_push_tasks(rq);
+#else
+   if (dl_task(rq->curr))
+   check_preempt_curr_dl(rq, p, 0);
+   else
+   resched_curr(rq);
 #endif /* CONFIG_SMP */
-   if (check_resched) {
-   if (dl_task(rq->curr))
-   check_preempt_curr_dl(rq, p, 0);
-   else
-   resched_curr(rq);
-   }
}
 }
 
@@ -1637,15 +1642,14 @@ static void prio_changed_dl(struct rq *rq, struct 
task_struct *p,
 * or lowering its prio, so...
 */
if (!rq->dl.overloaded)
-   pull_dl_task(rq);
+   queue_pull_task(rq);
 
/*
 * If we now have a earlier deadline task than p,
 * then reschedule, provided p is still on this
 * runqueue.
 */
-   if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline) &&
-   rq->curr == p)
+   if (dl_time_before(rq->dl.earliest_dl.curr, p->dl.deadline))
resched_curr(rq);
 #else
/*
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH for v3.18.25 4/6] sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks

2016-01-05 Thread Byungchul Park
From: Peter Zijlstra 

Remove the direct {push,pull} balancing operations from
switched_{from,to}_rt() / prio_changed_rt() and use the balance
callback queue.

Again, err on the side of too many reschedules; since too few is a
hard bug while too many is just annoying.
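
As a rough userspace illustration of the idea (not the scheduler code; every toy_* name below is made up for this sketch), balancing work can be queued while the runqueue is being modified and run later in one pass. A callback head can sit on the list only once, which is presumably why push and pull each get their own per-CPU head:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_rq;

/* One deferred-work node; hypothetical stand-in for a callback head. */
struct toy_callback {
	struct toy_callback *next;
	void (*func)(struct toy_rq *rq);
	bool queued;
};

struct toy_rq {
	struct toy_callback *pending;	/* work queued for later */
	int pushes;
	int pulls;
};

/* Queue func for later; a head that is already queued is left alone. */
static void toy_queue_callback(struct toy_rq *rq, struct toy_callback *head,
			       void (*func)(struct toy_rq *rq))
{
	if (head->queued)
		return;
	head->queued = true;
	head->func = func;
	head->next = rq->pending;
	rq->pending = head;
}

/* Run everything queued so far, detaching the list first. */
static void toy_run_callbacks(struct toy_rq *rq)
{
	struct toy_callback *head = rq->pending;

	rq->pending = NULL;
	while (head) {
		struct toy_callback *next = head->next;

		head->next = NULL;
		head->queued = false;
		head->func(rq);
		head = next;
	}
}

static void toy_push(struct toy_rq *rq) { rq->pushes++; }
static void toy_pull(struct toy_rq *rq) { rq->pulls++; }
```

Queuing toy_push twice on the same head before toy_run_callbacks() runs it only once, while a separate head for toy_pull lets both kinds of work be pending at the same time.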

Signed-off-by: Peter Zijlstra (Intel) 
Cc: ktk...@parallels.com
Cc: rost...@goodmis.org
Cc: juri.le...@gmail.com
Cc: pang.xun...@linaro.org
Cc: o...@redhat.com
Cc: wanpeng...@linux.intel.com
Cc: umgwanakikb...@gmail.com
Link: http://lkml.kernel.org/r/20150611124742.766832...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Byungchul Park 

Conflicts:
kernel/sched/rt.c
---
 kernel/sched/rt.c | 35 +++
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index ce807aa..fe0399f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -338,16 +338,23 @@ static inline int has_pushable_tasks(struct rq *rq)
return !plist_head_empty(&rq->rt.pushable_tasks);
 }
 
-static DEFINE_PER_CPU(struct callback_head, rt_balance_head);
+static DEFINE_PER_CPU(struct callback_head, rt_push_head);
+static DEFINE_PER_CPU(struct callback_head, rt_pull_head);
 
 static void push_rt_tasks(struct rq *);
+static void pull_rt_task(struct rq *);
 
 static inline void queue_push_tasks(struct rq *rq)
 {
if (!has_pushable_tasks(rq))
return;
 
-   queue_balance_callback(rq, &per_cpu(rt_balance_head, rq->cpu), 
push_rt_tasks);
+   queue_balance_callback(rq, &per_cpu(rt_push_head, rq->cpu), 
push_rt_tasks);
+}
+
+static inline void queue_pull_task(struct rq *rq)
+{
+   queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), 
pull_rt_task);
 }
 
 static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
@@ -1934,7 +1941,7 @@ static void switched_from_rt(struct rq *rq, struct 
task_struct *p)
if (!p->on_rq || rq->rt.rt_nr_running)
return;
 
-   pull_rt_task(rq);
+   queue_pull_task(rq);
 }
 
 void __init init_sched_rt_class(void)
@@ -1955,8 +1962,6 @@ void __init init_sched_rt_class(void)
  */
 static void switched_to_rt(struct rq *rq, struct task_struct *p)
 {
-   int check_resched = 1;
-
/*
 * If we are already running, then there's nothing
 * that needs to be done. But if we are not running
@@ -1966,13 +1971,12 @@ static void switched_to_rt(struct rq *rq, struct 
task_struct *p)
 */
if (p->on_rq && rq->curr != p) {
 #ifdef CONFIG_SMP
-   if (p->nr_cpus_allowed > 1 && rq->rt.overloaded &&
-   /* Don't resched if we changed runqueues */
-   push_rt_task(rq) && rq != task_rq(p))
-   check_resched = 0;
-#endif /* CONFIG_SMP */
-   if (check_resched && p->prio < rq->curr->prio)
+   if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
+   queue_push_tasks(rq);
+#else
+   if (p->prio < rq->curr->prio)
resched_curr(rq);
+#endif /* CONFIG_SMP */
}
 }
 
@@ -1993,14 +1997,13 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, 
int oldprio)
 * may need to pull tasks to this runqueue.
 */
if (oldprio < p->prio)
-   pull_rt_task(rq);
+   queue_pull_task(rq);
+
/*
 * If there's a higher priority task waiting to run
-* then reschedule. Note, the above pull_rt_task
-* can release the rq lock and p could migrate.
-* Only reschedule if p is still on the same runqueue.
+* then reschedule.
 */
-   if (p->prio > rq->rt.highest_prio.curr && rq->curr == p)
+   if (p->prio > rq->rt.highest_prio.curr)
resched_curr(rq);
 #else
/* For UP simply resched on drop of prio */
-- 
1.9.1



Re: [PATCH 4/5] perf tools: Support dynamic sort keys for -F/--fields

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 12:03:46PM +0900, Namhyung Kim wrote:
> Now dynamic sort keys are supported for tracepoint events, add it to
> output fields too.

Acked-by: Jiri Olsa 

thanks,
jirka


Re: [PATCH v2 4/6] clk: mediatek: Add MT2701 clock support

2016-01-05 Thread Philipp Zabel
Hi James,

On Tuesday, 05.01.2016, at 14:30 +0800, James Liao wrote:
> From: Shunli Wang 
> 
> Add MT2701 clock support, include topckgen, apmixedsys,
> infracfg, pericfg and subsystem clocks.
> 
> Signed-off-by: Shunli Wang 
> Signed-off-by: James Liao 
> ---
>  drivers/clk/mediatek/Kconfig  |8 +
>  drivers/clk/mediatek/Makefile |1 +
>  drivers/clk/mediatek/clk-gate.c   |   56 ++
>  drivers/clk/mediatek/clk-gate.h   |2 +
>  drivers/clk/mediatek/clk-mt2701.c | 1210 
> +
>  drivers/clk/mediatek/clk-mtk.c|   25 +
>  drivers/clk/mediatek/clk-mtk.h|   35 +-
>  7 files changed, 1334 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/clk/mediatek/clk-mt2701.c
> 
> diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig
> index dc224e6..6c7cdc0 100644
> --- a/drivers/clk/mediatek/Kconfig
> +++ b/drivers/clk/mediatek/Kconfig
> @@ -6,6 +6,14 @@ config COMMON_CLK_MEDIATEK
>   ---help---
> Mediatek SoCs' clock support.
>  
> +config COMMON_CLK_MT2701
> + bool "Clock driver for Mediatek MT2701 and MT7623"
> + depends on COMMON_CLK
> + select COMMON_CLK_MEDIATEK
> + default ARCH_MEDIATEK
> + ---help---
> +   This driver supports Mediatek MT2701 and MT7623 clocks.
> +
>  config COMMON_CLK_MT8135
>   bool "Clock driver for Mediatek MT8135"
>   depends on COMMON_CLK
> diff --git a/drivers/clk/mediatek/Makefile b/drivers/clk/mediatek/Makefile
> index 32e7222..5b2b91b 100644
> --- a/drivers/clk/mediatek/Makefile
> +++ b/drivers/clk/mediatek/Makefile
> @@ -1,4 +1,5 @@
>  obj-$(CONFIG_COMMON_CLK_MEDIATEK) += clk-mtk.o clk-pll.o clk-gate.o 
> clk-apmixed.o
>  obj-$(CONFIG_RESET_CONTROLLER) += reset.o
> +obj-$(CONFIG_COMMON_CLK_MT2701) += clk-mt2701.o
>  obj-$(CONFIG_COMMON_CLK_MT8135) += clk-mt8135.o
>  obj-$(CONFIG_COMMON_CLK_MT8173) += clk-mt8173.o
> diff --git a/drivers/clk/mediatek/clk-gate.c b/drivers/clk/mediatek/clk-gate.c
> index 576bdb7..38badb4 100644
> --- a/drivers/clk/mediatek/clk-gate.c
> +++ b/drivers/clk/mediatek/clk-gate.c
> @@ -61,6 +61,26 @@ static void mtk_cg_clr_bit(struct clk_hw *hw)
>   regmap_write(cg->regmap, cg->clr_ofs, BIT(cg->bit));
>  }
>  
> +static void mtk_cg_set_bit_no_setclr(struct clk_hw *hw)
> +{
> + struct mtk_clk_gate *cg = to_clk_gate(hw);
> + u32 val;
> +
> + regmap_read(cg->regmap, cg->sta_ofs, &val);
> + val |= BIT(cg->bit);
> + regmap_write(cg->regmap, cg->sta_ofs, val);

You can use regmap_update_bits here:

u32 bit = BIT(cg->bit);
regmap_update_bits(cg->regmap, cg->sta_ofs, bit, bit);

> +}
> +
> +static void mtk_cg_clr_bit_no_setclr(struct clk_hw *hw)
> +{
> + struct mtk_clk_gate *cg = to_clk_gate(hw);
> + u32 val;
> +
> + regmap_read(cg->regmap, cg->sta_ofs, &val);
> + val &= ~(BIT(cg->bit));
> + regmap_write(cg->regmap, cg->sta_ofs, val);

and here:

u32 bit = BIT(cg->bit);
regmap_update_bits(cg->regmap, cg->sta_ofs, bit, 0);
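
For readers unfamiliar with the call, regmap_update_bits(map, reg, mask, val) changes only the bits selected by mask. The sketch below models that semantics in plain userspace C; toy_regmap and the toy_* helpers are invented for illustration and are not the kernel regmap API:

```c
#include <stdint.h>

/* Toy stand-in for a regmap: a tiny array of 32-bit registers. */
struct toy_regmap {
	uint32_t regs[4];
};

/* Only the bits in mask change; they take their new value from val. */
static void toy_update_bits(struct toy_regmap *map, unsigned int reg,
			    uint32_t mask, uint32_t val)
{
	uint32_t old = map->regs[reg];

	map->regs[reg] = (old & ~mask) | (val & mask);
}

/* Set one gate bit, mirroring the first suggestion above. */
static void toy_gate_set(struct toy_regmap *map, unsigned int reg,
			 unsigned int bit)
{
	uint32_t mask = UINT32_C(1) << bit;

	toy_update_bits(map, reg, mask, mask);
}

/* Clear one gate bit, mirroring the second suggestion above. */
static void toy_gate_clr(struct toy_regmap *map, unsigned int reg,
			 unsigned int bit)
{
	uint32_t mask = UINT32_C(1) << bit;

	toy_update_bits(map, reg, mask, 0);
}
```

The open-coded read, modify, write sequence collapses into a single call, and the neighbouring bits of the register are left untouched.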

best regards
Philipp



Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote:
> On Mon, 4 Jan 2016 22:18:58 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > > This defines __smp_xxx barriers for s390,
> > > > for use by virtualization.
> > > > 
> > > > Some smp_xxx barriers are removed as they are
> > > > defined correctly by asm-generic/barriers.h
> > > > 
> > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > > unconditionally on this architecture.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > Acked-by: Arnd Bergmann 
> > > > ---
> > > >  arch/s390/include/asm/barrier.h | 15 +--
> > > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/arch/s390/include/asm/barrier.h 
> > > > b/arch/s390/include/asm/barrier.h
> > > > index c358c31..fbd25b2 100644
> > > > --- a/arch/s390/include/asm/barrier.h
> > > > +++ b/arch/s390/include/asm/barrier.h
> > > > @@ -26,18 +26,21 @@
> > > >  #define wmb()  barrier()
> > > >  #define dma_rmb()  mb()
> > > >  #define dma_wmb()  mb()
> > > > -#define smp_mb()   mb()
> > > > -#define smp_rmb()  rmb()
> > > > -#define smp_wmb()  wmb()
> > > > -
> > > > -#define smp_store_release(p, v)
> > > > \
> > > > +#define __smp_mb() mb()
> > > > +#define __smp_rmb()rmb()
> > > > +#define __smp_wmb()wmb()
> > > > +#define smp_mb()   __smp_mb()
> > > > +#define smp_rmb()  __smp_rmb()
> > > > +#define smp_wmb()  __smp_wmb()
> > > 
> > > Why define the smp_*mb() primitives here? Would not the inclusion of
> > > asm-generic/barrier.h do this?
> > 
> > No because the generic one is a nop on !SMP, this one isn't.
> > 
> > Pls note this patch is just reordering code without making
> > functional changes.
> > And at the moment, on s390 smp_xxx barriers are always non empty.
> 
> The s390 kernel is SMP to 99.99%, we just didn't bother with a
> non-smp variant for the memory-barriers. If the generic header
> is used we'd get the non-smp version for free. It will save a
> small amount of text space for CONFIG_SMP=n. 

OK, so I'll queue a patch to do this then?

Just to make sure: the question would be, are smp_xxx barriers ever used
in s390 arch specific code to flush in/out memory accesses for
synchronization with the hypervisor?

I went over s390 arch code and it seems to me the answer is no
(except of course for virtio).

But I also see a lot of weirdness on this architecture.

I found these calls:

arch/s390/include/asm/bitops.h: smp_mb__before_atomic();
arch/s390/include/asm/bitops.h: smp_mb();

Not used in arch specific code so this is likely OK.

arch/s390/kernel/vdso.c:smp_mb();

Looking at
Author: Christian Borntraeger 
Date:   Fri Sep 11 16:23:06 2015 +0200

s390/vdso: use correct memory barrier

By definition smp_wmb only orders writes against writes. (Finish all
previous writes, and do not start any future write). To protect the
vdso init code against early reads on other CPUs, let's use a full
smp_mb at the end of vdso init. As right now smp_wmb is implemented
as full serialization, this needs no stable backport, but this change
will be necessary if we reimplement smp_wmb.

This is OK from the hypervisor point of view, but it's also strange:
1. why isn't this paired with another mb somewhere?
   this seems to violate barrier pairing rules.
2. how does smp_mb protect against early reads on other CPUs?
   It normally does not: it orders reads from this CPU versus writes
   from the same CPU. But the init code does not appear to read anything.
   Maybe this is some s390 specific trick?

I could not figure out the above commit.
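
To spell out what "pairing" means here, below is a minimal userspace analogue using C11 fences. It is not s390 or kernel code; publish(), try_consume(), payload and ready are names made up for this sketch, with the release fence standing in for smp_wmb() and the acquire fence for smp_rmb():

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;		/* plain data being published */
static atomic_bool ready;	/* flag that announces the data */

/* Writer side: data first, then the fence, then the flag. */
static void publish(int value)
{
	payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Reader side: flag first, then the fence, then the data. */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return true;
}
```

The writer's fence is only useful because the reader has a matching one; with the reader fence missing, the reader could see the flag set and still read stale data.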


arch/s390/kvm/kvm-s390.c:   smp_mb();

Does not appear to be paired with anything.


arch/s390/lib/spinlock.c:   smp_mb();
arch/s390/lib/spinlock.c:   smp_mb();

Seems ok, and appears paired properly.
Just to make sure - spinlock is not paravirtualized on s390, is it?

arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();

It's all around vdso, so I'm guessing userspace is using this,
which is why there's no pairing.



> > Some of this could be sub-optimal, but
> > since on s390 Linux always runs on a hypervisor,
> > I am not sure it's safe to use the generic version -
> > in other words, it just might be that for s390 smp_ and virt_
> > barriers must be equivalent.
> 
> The definition of the memory barriers is independent from the fact
> i

Re: [PATCH v2 5/6] reset: mediatek: mt2701 reset controller dt-binding file

2016-01-05 Thread Philipp Zabel
On Tuesday, 05.01.2016, at 14:30 +0800, James Liao wrote:
> From: Shunli Wang 
> 
> Dt-binding file about reset controller is used to provide
> kinds of definition, which is referenced by dts file and
> IC-specified reset controller driver code.
> 
> Signed-off-by: Shunli Wang 
> ---
>  .../dt-bindings/reset-controller/mt2701-resets.h   | 74 
> ++
>  1 file changed, 74 insertions(+)
>  create mode 100644 include/dt-bindings/reset-controller/mt2701-resets.h
> 
> diff --git a/include/dt-bindings/reset-controller/mt2701-resets.h 
> b/include/dt-bindings/reset-controller/mt2701-resets.h

No new files in include/dt-bindings/reset-controller, please.
This should go into include/dt-bindings/reset.

regards
Philipp



Re: [PATCH v2 6/6] reset: mediatek: mt2701 reset driver

2016-01-05 Thread Philipp Zabel
On Tuesday, 05.01.2016, at 14:30 +0800, James Liao wrote:
> From: Shunli Wang 
> 
> In infrasys and perifsys, there are many reset
> control bits for kinds of modules. These bits are
> used as actual reset controllers to be registered
> into kernel's generic reset controller framework.
> 
> Signed-off-by: Shunli Wang 

Acked-by: Philipp Zabel 

regards
Philipp



Re: [PATCH 5/5] perf evlist: Add -T/--trace option to show trace fields

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 12:03:47PM +0900, Namhyung Kim wrote:
> To use dynamic sort keys, it might be good to add an option to see the
> list of field names.
> 
>   $ perf evlist -T -i perf.data.sched
>   sched:sched_switch: 
> trace_fields=prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
>   sched:sched_stat_wait: trace_fields=comm,pid,delay
>   sched:sched_stat_sleep: trace_fields=comm,pid,delay
>   sched:sched_stat_iowait: trace_fields=comm,pid,delay
>   sched:sched_stat_runtime: trace_fields=comm,pid,runtime,vruntime
>   sched:sched_process_fork: 
> trace_fields=parent_comm,parent_pid,child_comm,child_pid
>   sched:sched_wakeup: trace_fields=comm,pid,prio,success,target_cpu
>   sched:sched_wakeup_new: trace_fields=comm,pid,prio,success,target_cpu
>   sched:sched_migrate_task: trace_fields=comm,pid,prio,orig_cpu,dest_cpu
> 
> Signed-off-by: Namhyung Kim 

Acked-by: Jiri Olsa 

thanks,
jirka


RE: [f2fs-dev] [PATCH 1/3] f2fs: check the page status filled from disk

2016-01-05 Thread Chao Yu
Hi Jaegeuk,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Sunday, January 03, 2016 9:26 AM
> To: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 1/3] f2fs: check the page status filled from disk
> 
> After reading a page, we need to check whether there is any error.
> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/data.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 89a978c..11b2111 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -448,6 +448,14 @@ repeat:
> 
>   /* wait for read completion */
>   lock_page(page);
> + if (unlikely(!PageUptodate(page))) {
> + f2fs_put_page(page, 1);
> + return ERR_PTR(-EIO);

By convention, get_new_data_page should release ipage whenever an error
occurs. Still, I think it is OK to return directly here, since it seems
impossible for a new dentry page to already have a real block address.

To guard against bugs or wrong usage here, how about adding a bug_on, as in
the following patch?

From d92f0f34493b27ef28da67c446d552ce721b5d6f Mon Sep 17 00:00:00 2001
From: Chao Yu 
Date: Tue, 5 Jan 2016 15:28:56 +0800
Subject: [PATCH] f2fs: add f2fs_bug_on in get_new_data_page

In get_new_data_page, a locked inode page should not be held before
get_read_data_page is called; this patch adds f2fs_bug_on to detect
this condition.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 48f0bd3..2c5e3f6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -440,6 +440,8 @@ repeat:
zero_user_segment(page, 0, PAGE_CACHE_SIZE);
SetPageUptodate(page);
} else {
+   f2fs_bug_on(F2FS_I_SB(inode), ipage);
+
f2fs_put_page(page, 1);
 
page = get_read_data_page(inode, index, READ_SYNC, true);
-- 
2.6.3


> + }
> + if (unlikely(page->mapping != mapping)) {
> + f2fs_put_page(page, 1);
> + goto repeat;
> + }

How about using get_lock_data_page to avoid duplicating code?

>   }
>  got_it:
>   if (new_i_size && i_size_read(inode) <
> --
> 2.6.3
> 
> 
> --
> ___
> Linux-f2fs-devel mailing list
> linux-f2fs-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel



RE: [f2fs-dev] [PATCH 2/3] f2fs: cover more area with nat_tree_lock

2016-01-05 Thread Chao Yu
Hi Jaegeuk,

> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Sunday, January 03, 2016 9:26 AM
> To: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Cc: Jaegeuk Kim
> Subject: [f2fs-dev] [PATCH 2/3] f2fs: cover more area with nat_tree_lock
> 
> There was a subtle bug on nat cache management which incurs wrong nid 
> allocation
> or wrong block addresses when try_to_free_nats is triggered heavily.
> This patch enlarges the previous coverage of nat_tree_lock to avoid data race.

Have you figured out how this happens? I'm curious about this issue,
since so far I can neither reproduce it nor find any clue by reviewing
the code.

Thanks,

> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/node.c | 29 -
>  1 file changed, 12 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 669c44e..4dab09f 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -262,13 +262,11 @@ static void cache_nat_entry(struct f2fs_nm_info *nm_i, 
> nid_t nid,
>  {
>   struct nat_entry *e;
> 
> - down_write(&nm_i->nat_tree_lock);
>   e = __lookup_nat_cache(nm_i, nid);
>   if (!e) {
>   e = grab_nat_entry(nm_i, nid);
>   node_info_from_raw_nat(&e->ni, ne);
>   }
> - up_write(&nm_i->nat_tree_lock);
>  }
> 
>  static void set_node_addr(struct f2fs_sb_info *sbi, struct node_info *ni,
> @@ -380,6 +378,8 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, 
> struct node_info
> *ni)
> 
>   memset(&ne, 0, sizeof(struct f2fs_nat_entry));
> 
> + down_write(&nm_i->nat_tree_lock);
> +
>   /* Check current segment summary */
>   mutex_lock(&curseg->curseg_mutex);
>   i = lookup_journal_in_cursum(sum, NAT_JOURNAL, nid, 0);
> @@ -400,6 +400,7 @@ void get_node_info(struct f2fs_sb_info *sbi, nid_t nid, 
> struct node_info
> *ni)
>  cache:
>   /* cache nat entry */
>   cache_nat_entry(NM_I(sbi), nid, &ne);
> + up_write(&nm_i->nat_tree_lock);
>  }
> 
>  /*
> @@ -1459,13 +1460,10 @@ static int add_free_nid(struct f2fs_sb_info *sbi, 
> nid_t nid, bool build)
> 
>   if (build) {
>   /* do not add allocated nids */
> - down_read(&nm_i->nat_tree_lock);
>   ne = __lookup_nat_cache(nm_i, nid);
> - if (ne &&
> - (!get_nat_flag(ne, IS_CHECKPOINTED) ||
> + if (ne && (!get_nat_flag(ne, IS_CHECKPOINTED) ||
>   nat_get_blkaddr(ne) != NULL_ADDR))
>   allocated = true;
> - up_read(&nm_i->nat_tree_lock);
>   if (allocated)
>   return 0;
>   }
> @@ -1551,6 +1549,8 @@ static void build_free_nids(struct f2fs_sb_info *sbi)
>   ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nid), FREE_NID_PAGES,
>   META_NAT, true);
> 
> + down_read(&nm_i->nat_tree_lock);
> +
>   while (1) {
>   struct page *page = get_current_nat_page(sbi, nid);
> 
> @@ -1579,6 +1579,7 @@ static void build_free_nids(struct f2fs_sb_info *sbi)
>   remove_free_nid(nm_i, nid);
>   }
>   mutex_unlock(&curseg->curseg_mutex);
> + up_read(&nm_i->nat_tree_lock);
> 
>   ra_meta_pages(sbi, NAT_BLOCK_OFFSET(nm_i->next_scan_nid),
>   nm_i->ra_nid_pages, META_NAT, false);
> @@ -1861,14 +1862,12 @@ static void remove_nats_in_journal(struct 
> f2fs_sb_info *sbi)
> 
>   raw_ne = nat_in_journal(sum, i);
> 
> - down_write(&nm_i->nat_tree_lock);
>   ne = __lookup_nat_cache(nm_i, nid);
>   if (!ne) {
>   ne = grab_nat_entry(nm_i, nid);
>   node_info_from_raw_nat(&ne->ni, &raw_ne);
>   }
>   __set_nat_cache_dirty(nm_i, ne);
> - up_write(&nm_i->nat_tree_lock);
>   }
>   update_nats_in_cursum(sum, -i);
>   mutex_unlock(&curseg->curseg_mutex);
> @@ -1902,7 +1901,6 @@ static void __flush_nat_entry_set(struct f2fs_sb_info 
> *sbi,
>   struct f2fs_nat_block *nat_blk;
>   struct nat_entry *ne, *cur;
>   struct page *page = NULL;
> - struct f2fs_nm_info *nm_i = NM_I(sbi);
> 
>   /*
>* there are two steps to flush nat entries:
> @@ -1939,12 +1937,8 @@ static void __flush_nat_entry_set(struct f2fs_sb_info 
> *sbi,
>   raw_ne = &nat_blk->entries[nid - start_nid];
>   }
>   raw_nat_from_node_info(raw_ne, &ne->ni);
> -
> - down_write(&NM_I(sbi)->nat_tree_lock);
>   nat_reset_flag(ne);
>   __clear_nat_cache_dirty(NM_I(sbi), ne);
> - up_write(&NM_I(sbi)->nat_tree_lock);
> -
>   if (nat_get_blkaddr(ne) == NULL_ADDR)
>   add_free_nid(sbi, nid, false);
>   }
> @@ -1956,9 +1950,7 @@ static void __flush_nat_entry_set(s

Re: [PATCH] Staging: speakup: Fix getting port information

2016-01-05 Thread Dan Carpenter
On Tue, Jan 05, 2016 at 02:19:12AM +0100, Samuel Thibault wrote:
> --- a/drivers/staging/speakup/serialio.c
> +++ b/drivers/staging/speakup/serialio.c
> @@ -6,6 +6,9 @@
>  #include "spk_priv.h"
>  #include "serialio.h"
>  
> +#include 
> +#include 
> +

I'm sorry to do this but can you add a comment here, otherwise someone
is going to just change it back because it causes a checkpatch.pl
warning.  Make it a big ugly warning.

#include 
/* WARNING:  Do not change this to  without testing. */
#include 

regards,
dan carpenter



Re: [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin
On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> >> The two mechanisms referenced above would likely require coordination with
> >> QEMU and as such are open to discussion.  I haven't attempted to address
> >> them as I am not sure there is a consensus as of yet.  My personal
> >> preference would be to add a vendor-specific configuration block to the
> >> emulated pci-bridge interfaces created by QEMU that would allow us to
> >> essentially extend shpc to support guest live migration with pass-through
> >> devices.
> >
> > shpc?
> 
> That is kind of what I was thinking.  We basically need some mechanism
> to allow for the host to ask the device to quiesce.  It has been
> proposed to possibly even look at something like an ACPI interface
> since I know ACPI is used by QEMU to manage hot-plug in the standard
> case.
> 
> - Alex


Start by using hot-unplug for this!

Really, use your patch on the guest side, and write the host side
to allow starting migration with the device, but
defer completing it.

So

1.- host tells guest to start tracking memory writes
2.- guest acks
3.- migration starts
4.- most memory is migrated
5.- host tells guest to eject device
6.- guest acks
7.- stop vm and migrate rest of state
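
The ordering of these steps can be written down as a tiny state check. This is only a sketch of the sequence above, not QEMU or guest code; the mig_* names are hypothetical:

```c
/* Steps of the proposed flow, numbered as in the list above. */
enum mig_step {
	MIG_NONE = 0,
	MIG_HOST_REQUESTS_TRACKING,	/* 1 */
	MIG_GUEST_ACKS_TRACKING,	/* 2 */
	MIG_MIGRATION_STARTS,		/* 3 */
	MIG_MOST_MEMORY_MIGRATED,	/* 4 */
	MIG_HOST_REQUESTS_EJECT,	/* 5 */
	MIG_GUEST_ACKS_EJECT,		/* 6 */
	MIG_STOP_VM_AND_FINISH,		/* 7 */
};

/*
 * Accept "next" only if it directly follows the last completed step.
 * Returns the new completed step, or -1 if the step is out of order.
 */
static int mig_advance(enum mig_step done, enum mig_step next)
{
	if ((int)next > (int)MIG_STOP_VM_AND_FINISH ||
	    (int)next != (int)done + 1)
		return -1;
	return (int)next;
}
```

In particular the eject request (step 5) is rejected until most memory has been migrated (step 4), which is the point of deferring the unplug.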


It will already be a win since hot unplug after migration starts and
most memory has been migrated is better than hot unplug before migration
starts.

Then measure downtime and profile. Then we can look at ways
to quiesce device faster which really means step 5 is replaced
with "host tells guest to quiesce device and dirty (or just unmap!)
all memory mapped for write by device".

-- 
MST


Re: [PATCH v6 2/7] dax: support dirty DAX entries in radix tree

2016-01-05 Thread Jan Kara
On Wed 23-12-15 12:39:15, Ross Zwisler wrote:
> Add support for tracking dirty DAX entries in the struct address_space
> radix tree.  This tree is already used for dirty page writeback, and it
> already supports the use of exceptional (non struct page*) entries.
> 
> In order to properly track dirty DAX pages we will insert new exceptional
> entries into the radix tree that represent dirty DAX PTE or PMD pages.
> These exceptional entries will also contain the writeback sectors for the
> PTE or PMD faults that we can use at fsync/msync time.
> 
> There are currently two types of exceptional entries (shmem and shadow)
> that can be placed into the radix tree, and this adds a third.  We rely on
> the fact that only one type of exceptional entry can be found in a given
> radix tree based on its usage.  This happens for free with DAX vs shmem but
> we explicitly prevent shadow entries from being added to radix trees for
> DAX mappings.
> 
> The only shadow entries that would be generated for DAX radix trees would
> be to track zero page mappings that were created for holes.  These pages
> would receive minimal benefit from having shadow entries, and the choice
> to have only one type of exceptional entry in a given radix tree makes the
> logic simpler both in clear_exceptional_entry() and in the rest of DAX.
> 
> Signed-off-by: Ross Zwisler 

The patch looks good to me. You can add:

Reviewed-by: Jan Kara 

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH] arm: kernel: utilize hrtimer based broadcast

2016-01-05 Thread Thomas Gleixner
On Sat, 2 Jan 2016, Russell King - ARM Linux wrote:
> On Tue, Dec 29, 2015 at 02:54:10PM +0100, Thomas Gleixner wrote:
> > I have no real opinion about that patch. It does no harm to unconditionally
> > setup the hrtimer based broadcast even if it's never used.
> > 
> > Up to the arch maintainer to decide. 
> 
> That's really not fair to keep shovelling these kinds of decisions onto
> architecture maintainers without any kind of explanation about how an
> architecture maintainer should make such a decision.
> 
> Do I roll a 6-face dice, and if it gives an odd number, I apply this
> patch, otherwise I reject it?
> 
> Is there a technical basis for making the decision?  If so, please
> explain what the technical arguments are against having or not having
> this change.

The hrtimer based broadcast device is used when you have per cpu timers which
stop in deeper power states, but you have no other timer hardware on the chip
which can back up the per cpu timer in deep power states. The trick is that it
emulates timer hardware via a hrtimer and then tells the cpu idle code not
to go into deep power states on the cpu which owns that hrtimer. All other
cpus can go as deep as they want and still get woken up.
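
A toy model of that rule, for illustration only (this is not the tick broadcast code, and toy_deepest_idle() and its parameters are invented for the sketch):

```c
#include <stdbool.h>

enum toy_idle_depth {
	TOY_IDLE_SHALLOW,	/* per cpu timer keeps running */
	TOY_IDLE_DEEP,		/* per cpu timer stops */
};

/*
 * Deepest idle state a cpu may enter.
 * bc_owner is the cpu that currently owns the broadcast hrtimer;
 * has_hw_broadcast says whether real backup timer hardware exists.
 */
static enum toy_idle_depth toy_deepest_idle(int cpu, int bc_owner,
					    bool has_hw_broadcast)
{
	if (has_hw_broadcast)
		return TOY_IDLE_DEEP;	/* hardware wakes everyone up */

	/* The owner must keep its timer alive to wake the others. */
	return cpu == bc_owner ? TOY_IDLE_SHALLOW : TOY_IDLE_DEEP;
}
```

Only one cpu at a time pays the cost of staying in a shallow state, and on platforms with real broadcast hardware the restriction never applies.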

The only downside of adding this unconditionally is extra code in case that it
is not needed on a particular platform.

Hope that helps.

 tglx


Re: [PATCHv2 3/3] thermal: improve hot trip handling

2016-01-05 Thread Geert Uytterhoeven
Hi Eduardo,

On Thu, Dec 17, 2015 at 8:13 PM, Eduardo Valentin  wrote:
> The idea is to add the choice to be notified only when temperature
> crosses trip points. The trip points affected are the non-passive
> trip points.
>
> It will check last temperature and current temperature against
> the trip point temperature and its hysteresis.
> In case the check shows temperature has changed enought indicating
> a trip point crossing, a uevent will be sent to userspace.
>
> The uevent contains the thermal zone type, the current temperature,
> the last temperature and the trip point in which the current temperature
> now resides.
>
> The behavior of ops->notify() callback remains the same.
>
> Cc: Zhang Rui 
> Cc: linux...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Eduardo Valentin 
> ---
> V1->V2: none
> ---
>  drivers/thermal/thermal_core.c | 52 
> ++
>  1 file changed, 52 insertions(+)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a229c84..e0f1f4e 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -423,6 +423,56 @@ static void handle_non_critical_trips(struct 
> thermal_zone_device *tz,
>def_governor->throttle(tz, trip);
>  }
>
> +static void thermal_tripped_notify(struct thermal_zone_device *tz,
> +  int trip, enum thermal_trip_type trip_type,
> +  int trip_temp)
> +{
> +   char tuv_name[THERMAL_NAME_LENGTH + 15], tuv_temp[25],
> +   tuv_ltemp[25], tuv_trip[25], tuv_type[25];
> +   char *msg[6] = { tuv_name, tuv_temp, tuv_ltemp, tuv_trip, tuv_type,
> +   NULL };
> +   int upper_trip_hyst, upper_trip_temp, trip_hyst = 0;
> +   int ret = 0;
> +
> +   snprintf(tuv_name, sizeof(tuv_name), "THERMAL_ZONE=%s", tz->type);
> +   snprintf(tuv_temp, sizeof(tuv_temp), "TEMP=%d", tz->temperature);
> +   snprintf(tuv_ltemp, sizeof(tuv_ltemp), "LAST_TEMP=%d",
> +tz->last_temperature);
> +   snprintf(tuv_trip, sizeof(tuv_trip), "TRIP=%d", trip);
> +   snprintf(tuv_type, sizeof(tuv_type), "TRIP_TYPE=%d", trip_type);
> +
> +   mutex_lock(&tz->lock);
> +
> +   /* crossing up */
> +   if (tz->last_temperature < trip_temp && trip_temp < tz->temperature)
> +   kobject_uevent_env(&tz->device.kobj, KOBJ_CHANGE, msg);
> +
> +   if (tz->ops->get_trip_hyst)
> +   tz->ops->get_trip_hyst(tz, trip, &trip_hyst);
> +
> +   /* crossing down, check for hyst */
> +   trip_temp -= trip_hyst;
> +   if (tz->last_temperature > trip_temp && trip_temp > tz->temperature) {
> +   snprintf(tuv_trip, sizeof(tuv_trip), "TRIP=%d", trip - 1);
> +   kobject_uevent_env(&tz->device.kobj, KOBJ_CHANGE, msg);
> +   }
> +
> +   ret = tz->ops->get_trip_temp(tz, trip + 1, &upper_trip_temp);

"trip + 1" may be equal to thermal_zone_device.trips and thus out-of-range,
in which case rcar_thermal_get_trip_temp() will print an error message:

rcar_thermal e61f.thermal: rcar driver trip error

Is the "+ 1" (also below) intentional?
If yes, I think the related error messages in rcar_thermal.c should be reduced
to debug messages.
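
For illustration, if the "+ 1" is intentional, one option would be to range-check the upper trip index before calling ->get_trip_temp(), so the driver never sees an invalid index. The following is only a userspace sketch: struct mock_zone, mock_get_trip_temp() and upper_trip_temp() are hypothetical stand-ins, not the thermal core API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the few thermal zone fields used here. */
struct mock_zone {
	int trips;              /* number of valid trip points */
	const int *trip_temps;  /* trip temperatures, index 0..trips-1 */
};

/* Mimics a driver callback that rejects out-of-range trip indices. */
int mock_get_trip_temp(const struct mock_zone *tz, int trip, int *temp)
{
	if (trip < 0 || trip >= tz->trips)
		return -EINVAL;
	*temp = tz->trip_temps[trip];
	return 0;
}

/*
 * Look up the trip above 'trip'. Returns -ENOENT without calling the
 * driver when 'trip' is already the highest one.
 */
int upper_trip_temp(const struct mock_zone *tz, int trip, int *temp)
{
	assert(tz != NULL && temp != NULL);
	if (trip + 1 >= tz->trips)
		return -ENOENT;
	return mock_get_trip_temp(tz, trip + 1, temp);
}
```

With such a guard the caller can treat "no upper trip" as a normal case instead of a driver error.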

> +   if (ret)
> +   goto unlock;
> +
> +   if (tz->ops->get_trip_hyst)
> +   tz->ops->get_trip_hyst(tz, trip + 1, &upper_trip_hyst);

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: rsi: Delete unnecessary variable initialisations in rsi_send_mgmt_pkt()

2016-01-05 Thread Julian Calaby
Hi Markus,

On Tue, Jan 5, 2016 at 7:29 PM, SF Markus Elfring
 wrote:
>> That said, if you figure out some change that produces significant
>> reductions in code or binary size on multiple architectures without
>> making things more complicated, less readable or making the code or
>> binary size larger, then by all means propose it.
>
> Are you looking also for "a proof" that such changes are worthwhile?

It'd be better than "I think doing things this way is better", which
is the hallmark of most of your patch sets. (Admittedly not this one,
but this one is where the discussion is now, so that's where we're
discussing it.)

>> "This makes things smaller" carries much more weight than
>> "I think this is better".
>
> Can the discussed implementation of a function like "rsi_send_mgmt_pkt"
> become a bit smaller by the deletion of extra variable initialisations

I'm talking in general.

In this case you're asking people to review a patch which requires a
lot of careful review for a fairly minor improvement. I must also note
that you haven't CC'd the people who wrote this driver, so it's
possible that the only people who have reviewed it aren't experts in
the code.

The patches you sent recently which moved labels into if statements
were a clear case of "I think this is better" where any actual benefit
from the changes was eclipsed by the style and readability issues they
introduced.

>> Almost all of the changes you've proposed that have seen any
>> discussion whatsoever fall into the latter category.
>
> Thanks for your interesting feedback.

No problem.

> Can a further constructive dialogue evolve from the presented information?

Part of the issue here is that you don't seem to be listening to the
discussion of your patches, or if you are, you're not significantly
changing your approach or attitude in response.

Every time you send a set of patches, there are legitimate issues
which people raise, and every time they are discussed, you assert that
your patches improve things and seem to ignore the concerns people
raise.

I've seen this same pattern of discussion here with these patches,
with your patches to move labels into if statements, with the patches
you sent late June last year, your patches to remove conditions before
kfree() and friends, etc.

You need to change your attitude: just because you can see some benefit
from your patches doesn't mean others do and it doesn't mean that
they're willing to accept them.

Thanks,

-- 
Julian Calaby

Email: julian.cal...@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/


Re: [PATCH v3 1/3] clocksource/vt8500: Increase the minimum delta

2016-01-05 Thread Roman Volkov
On Tue, 5 Jan 2016 10:01:07 +0100
Daniel Lezcano wrote:

> On 01/01/2016 02:24 PM, Roman Volkov wrote:
> > From: Roman Volkov 
> >
> > The vt8500 clocksource driver declares itself as capable to handle
> > the minimum delay of 4 cycles by passing the value into
> > clockevents_config_and_register(). The vt8500_timer_set_next_event()
> > requires the passed cycles value to be at least 16. The impact is
> > that userspace hangs in nanosleep() calls with small delay
> > intervals.
> >
> > This problem is reproducible in Linux 4.2 starting from:
> > c6eb3f70d448 ('hrtimer: Get rid of hrtimer softirq')
> >
> > Signed-off-by: Roman Volkov 
> > Acked-by: Alexey Charkov   
> 
> Hi Roman,
> 
> I looked at the email thread, and IIUC if set_next_event fails, the 
> system freeze. Your patch fixes the issue for your driver but not the 
> real issue because if set_next_event fails, at least a warning should 
> appear in the log or better nanosleep should fail gracefully.

Hi Daniel,

I agree, but if nanosleep returns immediately, this can lead to
undefined behavior in the software. Maybe the system could fall back to a
busy loop to recover from this state and print a message to the log? At
the driver level, it seems to be enough to fail the function without
printing anything to the log.
 
> BTW why min delta is MIN_OSCR_DELTA * 2 in
> clockevents_config_and_register ?

All this is just to be consistent with PXA. Maybe PXA works with smaller
values, e.g., 8. For vt8500, accessing the registers is more complex,
and this should consume more time. IIUC, if the driver does not support
very small delays, the system will handle them with a busy loop?

Why multiply by two? Good question. Maybe it is a safety margin for
stability. The value passed by the system to set_next_event() should
not be smaller than this value, and theoretically, we should not need to
multiply MIN_OSCR_DELTA by two. As far as I can see, many drivers have
no such minimum values at all.

Added Robert

Regards,
Roman


Re: [PATCH v5 01/11] arm-cci: Define CCI counter period

2016-01-05 Thread Suzuki K. Poulose

On 04/01/16 18:27, Mark Rutland wrote:
> On Mon, Jan 04, 2016 at 11:54:40AM +, Suzuki K. Poulose wrote:
>> Instead of hard coding the period we program on the PMU
>> counters, define a symbol.
>>
>> -   u64 val = 1ULL << 31;
>> -   local64_set(&hwc->prev_count, val);
>> -   pmu_write_counter(event, val);
>> +   local64_set(&hwc->prev_count, CCI_CNTR_PERIOD);
>> +   pmu_write_counter(event, CCI_CNTR_PERIOD);
>
> I think this is a little misleading (and confusing), as we're conflating
> the period with its inverse. This wouldn't work for any other value of
> CCI_CNTR_PERIOD.
>
> Perhaps s/PERIOD/START_VAL/, leaving everything else as-is?


You are right, will change it.
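
To illustrate the point about conflating the two: assuming the counter is 32 bits wide (an assumption here, the quoted hunk does not say so), a counter started at S overflows after 2^32 - S events, so start value and period coincide only for S = 2^31. A small userspace sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Events a 32-bit up-counter records before it wraps, starting at 'start'. */
uint64_t events_until_overflow(uint32_t start)
{
	return (UINT64_C(1) << 32) - start;
}

/* Non-zero only when the start value happens to equal the period. */
int start_equals_period(uint32_t start)
{
	return events_until_overflow(start) == start;
}
```

For example, a start value of 1 << 30 would give a period of 3 << 30, which is why naming the constant after the start value is less misleading.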

Cheers
Suzuki


Re: [PATCH V5 9/9] hvsock: introduce Hyper-V VM Sockets feature

2016-01-05 Thread Vitaly Kuznetsov
Dexuan Cui  writes:

Just some minor nitpicks below -- I have to admit I didn't test the feature.

[..skip..] 

> +
> + if (sk->sk_err) {
> + ret = -sk->sk_err;
> + goto out_wait_error;
> + } else {
> + ret = 0;
> + }
> +
> +out_wait:
> + finish_wait(sk_sleep(sk), &wait);
> +out:
> + release_sock(sk);
> + return ret;
> +
> +out_wait_error:
> + sk->sk_state = SS_UNCONNECTED;
> + sock->state = SS_UNCONNECTED;
> + goto out_wait;
> +}

Why not just place out_wait_error label before out_wait (and do 'goto
out_wait' in ret = 0 case instead of 'goto out_wait_error' in the error
case)?
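
A sketch of the suggested label layout, written as plain userspace C so the control flow can be checked in isolation. struct mock_sock, its fields, the 'timed_out' parameter and mock_connect_tail() are hypothetical stand-ins for the quoted code, not the hv_sock implementation:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket state touched on the error path. */
enum mock_state { MOCK_CONNECTING, MOCK_UNCONNECTED };

struct mock_sock {
	int err;                /* plays the role of sk->sk_err */
	enum mock_state state;  /* plays the role of sk->sk_state */
	int wait_finished;      /* set where finish_wait() would run */
	int released;           /* set where release_sock() would run */
};

/* Tail of a connect path with the error label placed before out_wait. */
int mock_connect_tail(struct mock_sock *sk, int timed_out)
{
	int ret;

	assert(sk != NULL);
	if (timed_out) {
		ret = -ETIMEDOUT;
		goto out_wait_error;
	}
	if (!sk->err) {
		ret = 0;
		goto out_wait;
	}
	ret = -sk->err;

out_wait_error:
	/* Error paths reset the state, then fall through to the cleanup. */
	sk->state = MOCK_UNCONNECTED;
out_wait:
	sk->wait_finished = 1;
	sk->released = 1;
	return ret;
}
```

The error path falls through into the common cleanup, so no backward goto is needed.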

[..skip..]

> +
> +static int __init hvsock_init(void)
> +{
> + int ret;
> +
> + /* Hyper-V socket requires at least VMBus 4.0 */
> + if ((vmbus_proto_version >> 16) < 4) {
> + pr_err("failed to load: VMBus 4 or later is required\n");
> + return -ENODEV;

(Let me pretend I'm Dan :-) So here we return ...

> + }
> +
> + ret = vmbus_driver_register(&hvsock_drv);
> + if (ret) {
> + pr_err("failed to register hv_sock driver\n");
> + goto out;

... and here we goto where we just return. I suggest we bring some
consistency by directly returning ret here and eliminating 'out' label. 

> + }
> +
> + ret = proto_register(&hvsock_proto, 0);
> + if (ret) {
> + pr_err("failed to register protocol\n");
> + goto unreg_hvsock_drv;
> + }
> +
> + ret = sock_register(&hvsock_family_ops);
> + if (ret) {
> + pr_err("failed to register address family\n");
> + goto unreg_proto;
> + }
> +
> + return 0;
> +
> +unreg_proto:
> + proto_unregister(&hvsock_proto);
> +unreg_hvsock_drv:
> + vmbus_driver_unregister(&hvsock_drv);
> +out:
> + return ret;
> +}
> +
> +static void __exit hvsock_exit(void)
> +{
> + sock_unregister(AF_HYPERV);
> + proto_unregister(&hvsock_proto);
> + vmbus_driver_unregister(&hvsock_drv);
> +}
> +
> +module_init(hvsock_init);
> +module_exit(hvsock_exit);
> +
> +MODULE_DESCRIPTION("Microsoft Hyper-V Virtual Socket Family");
> +MODULE_VERSION("0.1");

Do we really need it? Once the driver is committed we probably won't be
updating it to v0.2 as a whole; we'll be sending patches addressing
issues, and there will always be the question of when to switch to 0.2, 0.3,
... And we don't have MODULE_VERSION for other Hyper-V drivers.

> +MODULE_LICENSE("Dual BSD/GPL");

-- 
  Vitaly


Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Boqun Feng
On Tue, Jan 05, 2016 at 10:51:17AM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote:
> > Hi Michael,
> > 
> > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote:
> > > This defines __smp_xxx barriers for powerpc
> > > for use by virtualization.
> > > 
> > > smp_xxx barriers are removed as they are
> > > defined correctly by asm-generic/barriers.h
> 
> I think this is the part that was missed in review.
> 

Yes, I realized my mistake after rereading the series. But smp_lwsync() is
not defined in asm-generic/barriers.h, right?

> > > This reduces the amount of arch-specific boiler-plate code.
> > > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > Acked-by: Arnd Bergmann 
> > > ---
> > >  arch/powerpc/include/asm/barrier.h | 24 
> > >  1 file changed, 8 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/barrier.h 
> > > b/arch/powerpc/include/asm/barrier.h
> > > index 980ad0c..c0deafc 100644
> > > --- a/arch/powerpc/include/asm/barrier.h
> > > +++ b/arch/powerpc/include/asm/barrier.h
> > > @@ -44,19 +44,11 @@
> > >  #define dma_rmb()__lwsync()
> > >  #define dma_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > >  
> > > -#ifdef CONFIG_SMP
> > > -#define smp_lwsync() __lwsync()
> > > +#define __smp_lwsync()   __lwsync()
> > >  
> > 
> > so __smp_lwsync() is always mapped to lwsync, right?
> 
> Yes.
> 
> > > -#define smp_mb() mb()
> > > -#define smp_rmb()__lwsync()
> > > -#define smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > > -#else
> > > -#define smp_lwsync() barrier()
> > > -
> > > -#define smp_mb() barrier()
> > > -#define smp_rmb()barrier()
> > > -#define smp_wmb()barrier()
> > > -#endif /* CONFIG_SMP */
> > > +#define __smp_mb()   mb()
> > > +#define __smp_rmb()  __lwsync()
> > > +#define __smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > >  
> > >  /*
> > >   * This is a barrier which prevents following instructions from being
> > > @@ -67,18 +59,18 @@
> > >  #define data_barrier(x)  \
> > >   asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> > >  
> > > -#define smp_store_release(p, v)  
> > > \
> > > +#define __smp_store_release(p, v)
> > > \
> > >  do { 
> > > \
> > >   compiletime_assert_atomic_type(*p); \
> > > - smp_lwsync();   \
> > > + __smp_lwsync(); \
> > 
> > , therefore this will emit an lwsync no matter SMP or UP.
> 
> Absolutely. But smp_store_release (without __) will not.
> 
> Please note I did test this: for ppc code before and after
> this patch generates exactly the same binary on SMP and UP.
> 

Yes, you're right, sorry for my mistake...
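
For readers following along, the layering being discussed can be modelled in userspace roughly as below. This is only a sketch: the counter hw_barriers stands in for emitting a real lwsync/sync, the cost_of_*() helpers are hypothetical, and the macro bodies are simplified versions of what the series describes (arch code provides __smp_mb(), generic code maps smp_mb() to it only on SMP builds).

```c
/* Counts "real" barrier instructions; a stand-in for emitting lwsync/sync. */
static int hw_barriers;

/* Compiler-only barrier (GCC/Clang inline asm syntax). */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Arch-provided: always the real barrier, regardless of CONFIG_SMP. */
#define __smp_mb() do { hw_barriers++; barrier(); } while (0)

/* Generic layer: real barrier on SMP, compiler barrier only on UP. */
#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif

/* Hardware barriers issued by one smp_mb() in this build. */
int cost_of_smp_mb(void)
{
	int before = hw_barriers;

	smp_mb();
	return hw_barriers - before;
}

/* Hardware barriers issued by one __smp_mb() in this build. */
int cost_of_virt_mb(void)
{
	int before = hw_barriers;

	__smp_mb();
	return hw_barriers - before;
}
```

Built without CONFIG_SMP defined, smp_mb() costs no hardware barrier while __smp_mb() still costs one, which is the UP behaviour described above.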

> 
> > Another thing is that smp_lwsync() may have a third user(other than
> > smp_load_acquire() and smp_store_release()):
> > 
> > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> > 
> > I'm OK to change my patch accordingly, but do we really want
> > smp_lwsync() get involved in this cleanup? If I understand you
> > correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> > while smp_lwsync() is internal to PPC.
> > 
> > Regards,
> > Boqun
> 
> I think you missed the leading ___ :)
> 

What I meant here was that smp_lwsync() was originally internal to PPC, but
never mind ;-)

> smp_store_release is external and it needs __smp_lwsync as
> defined here.
> 
> I can duplicate some code and have smp_lwsync *not* call __smp_lwsync

You mean bringing smp_lwsync() back? I ask because I haven't seen you define
it in asm-generic/barriers.h in previous patches, and you just delete it in
this patch.

> but why do this? Still, if you prefer it this way,
> please let me know.
> 

I think deleting smp_lwsync() is fine, though I need to change the atomic
variants patches for PPC because of it ;-/

Regards,
Boqun

> > >   WRITE_ONCE(*p, v);  \
> > >  } while (0)
> > >  
> > > -#define smp_load_acquire(p)  
> > > \
> > > +#define __smp_load_acquire(p)
> > > \
> > >  ({   
> > > \
> > >   typeof(*p) ___p1 = READ_ONCE(*p);   \
> > >   compiletime_assert_atomic_type(*p); \
> > > - smp_lwsync();   \
> > > + __smp_lwsync(); \
> > >   ___p1;  \
> > >  })
> > >  
> > > -- 
> > > MST
> > > 
> > > --
> > > To u

Re: [PATCH] lightnvm: add full block direct to the gc list

2016-01-05 Thread Wenwei Tao
You are right, a deadlock might occur if interrupts are not disabled.

We might add the block to prio_list when we find that the block is full in
rrpc_alloc_addr, and check whether all the writes are complete in
rrpc_lun_gc. This way we may avoid both the gcb allocation failure and the
irq disabling issue.

But this still has a problem. We allocate a page from the block before the
write, but the bio submission may fail; then the bio never gets executed and
rrpc_end_io never gets called on it. This may lead to the following
situation: all of a block's pages are allocated, but not all of them are
used. So the block is never counted as fully used, and will not get reclaimed
for further use.

I think we may need to put the page back when the page is not actually
used/programmed.

2016-01-04 19:24 GMT+08:00 Matias Bjørling :
> On 01/04/2016 10:54 AM, Wenwei Tao wrote:
>>
>> We allocate gcb to queue full block to the gc list,
>> but gcb allocation may fail, if that happens, the
>> block will not get reclaimed. So add the full block
>> direct to the gc list, omit the queuing step.
>>
>> Signed-off-by: Wenwei Tao 
>> ---
>>   drivers/lightnvm/rrpc.c | 47
>> ++-
>>   1 file changed, 10 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c
>> index 40b0309..27fb98d 100644
>> --- a/drivers/lightnvm/rrpc.c
>> +++ b/drivers/lightnvm/rrpc.c
>> @@ -475,24 +475,6 @@ static void rrpc_lun_gc(struct work_struct *work)
>> /* TODO: Hint that request queue can be started again */
>>   }
>>
>> -static void rrpc_gc_queue(struct work_struct *work)
>> -{
>> -   struct rrpc_block_gc *gcb = container_of(work, struct
>> rrpc_block_gc,
>> -
>> ws_gc);
>> -   struct rrpc *rrpc = gcb->rrpc;
>> -   struct rrpc_block *rblk = gcb->rblk;
>> -   struct nvm_lun *lun = rblk->parent->lun;
>> -   struct rrpc_lun *rlun = &rrpc->luns[lun->id - rrpc->lun_offset];
>> -
>> -   spin_lock(&rlun->lock);
>> -   list_add_tail(&rblk->prio, &rlun->prio_list);
>> -   spin_unlock(&rlun->lock);
>> -
>> -   mempool_free(gcb, rrpc->gcb_pool);
>> -   pr_debug("nvm: block '%lu' is full, allow GC (sched)\n",
>> -   rblk->parent->id);
>> -}
>> -
>>   static const struct block_device_operations rrpc_fops = {
>> .owner  = THIS_MODULE,
>>   };
>> @@ -620,39 +602,30 @@ err:
>> return NULL;
>>   }
>>
>> -static void rrpc_run_gc(struct rrpc *rrpc, struct rrpc_block *rblk)
>> -{
>> -   struct rrpc_block_gc *gcb;
>> -
>> -   gcb = mempool_alloc(rrpc->gcb_pool, GFP_ATOMIC);
>> -   if (!gcb) {
>> -   pr_err("rrpc: unable to queue block for gc.");
>> -   return;
>> -   }
>> -
>> -   gcb->rrpc = rrpc;
>> -   gcb->rblk = rblk;
>> -
>> -   INIT_WORK(&gcb->ws_gc, rrpc_gc_queue);
>> -   queue_work(rrpc->kgc_wq, &gcb->ws_gc);
>> -}
>> -
>>   static void rrpc_end_io_write(struct rrpc *rrpc, struct rrpc_rq *rrqd,
>> sector_t laddr, uint8_t
>> npages)
>>   {
>> struct rrpc_addr *p;
>> struct rrpc_block *rblk;
>> struct nvm_lun *lun;
>> +   struct rrpc_lun *rlun;
>> int cmnt_size, i;
>>
>> for (i = 0; i < npages; i++) {
>> p = &rrpc->trans_map[laddr + i];
>> rblk = p->rblk;
>> lun = rblk->parent->lun;
>> +   rlun = &rrpc->luns[lun->id - rrpc->lun_offset];
>>
>> cmnt_size = atomic_inc_return(&rblk->data_cmnt_size);
>> -   if (unlikely(cmnt_size == rrpc->dev->pgs_per_blk))
>> -   rrpc_run_gc(rrpc, rblk);
>> +   if (unlikely(cmnt_size == rrpc->dev->pgs_per_blk)) {
>> +   pr_debug("nvm: block '%lu' is full, allow GC
>> (sched)\n",
>> +   rblk->parent->id);
>> +   spin_lock(&rlun->lock);
>
>
> A deadlock might occur, as the lock can be called from interrupt context.
> The other ->rlun usages will have to be converted to
> spinlock_irqsave/spinlock_irqrestore to be valid.
>
> The reason for the queueing is that the ->rlun lock is held for a while in
> rrpc_lun_gc. Therefore, it rather takes the queueing overhead, than disable
> interrupts on the CPU for the duration of the ->prio sorting and selection
> of victim block. My assumptions about this optimization might be premature.
> So I like to be proved wrong.
>
>
>> +   list_add_tail(&rblk->prio, &rlun->prio_list);
>> +   spin_unlock(&rlun->lock);
>> +
>> +   }
>> }
>>   }
>>
>>
>
>


[RFC PATCH v2] perf test: Improve bp_signal

2016-01-05 Thread Wang Nan
Will Deacon [1] has some questions on patch [2]. This patch improves
test__bp_signal so we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result:

 On x86_64 and ARM64 (results are similar with patch [2] on ARM64):

 # ./perf test -v signal
 17: Test breakpoint overflow signal handler  :
 --- start ---
 test child forked, pid 10213
 count1 1, count2 3, count3 2, overflow 3, overflows_2 3
 test child finished with 0
 ---- end ----
 Test breakpoint overflow signal handler: Ok

So at least the 2 cases Will doubted are handled correctly.

[1] http://lkml.kernel.org/g/20160104165535.gi1...@arm.com
[2] 
http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangn...@huawei.com

Signed-off-by: Wang Nan 
Signed-off-by: Jiri Olsa 
Cc: Will Deacon 
Cc: Arnaldo Carvalho de Melo 
---

v1 -> v2: Improve readability, fix typo. Thanks to Jiri Olsa.

To Jiri: I guess you will be okay to provide your SOB for your code at [3],
 so I add it in this v2 patch.

[3] http://lkml.kernel.org/g/20160105090030.gc2...@krava.brq.redhat.com

---
 tools/perf/tests/bp_signal.c | 140 ---
 1 file changed, 118 insertions(+), 22 deletions(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index fb80c9e..1d1bb48 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -29,14 +29,59 @@
 
 static int fd1;
 static int fd2;
+static int fd3;
 static int overflows;
+static int overflows_2;
+
+volatile long the_var;
+
+
+/*
+ * Use ASM to ensure watchpoint and breakpoint can be triggered
+ * at one instruction.
+ */
+#if defined (__x86_64__)
+extern void __test_function(volatile long *ptr);
+asm (
+   ".globl __test_function\n"
+   "__test_function:\n"
+   "incq (%rdi)\n"
+   "ret\n");
+#elif defined (__aarch64__)
+extern void __test_function(volatile long *ptr);
+asm (
+   ".globl __test_function\n"
+   "__test_function:\n"
+   "str x30, [x0]\n"
+   "ret\n");
+
+#else
+static void __test_function(volatile long *ptr)
+{
+   *ptr = 0x1234;
+}
+#endif
 
 __attribute__ ((noinline))
 static int test_function(void)
 {
+   __test_function(&the_var);
+   the_var++;
return time(NULL);
 }
 
+static void sig_handler_2(int signum __maybe_unused,
+ siginfo_t *oh __maybe_unused,
+ void *uc __maybe_unused)
+{
+   overflows_2++;
+   if (overflows_2 > 10) {
+   ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
+   ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+   ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
+   }
+}
+
 static void sig_handler(int signum __maybe_unused,
siginfo_t *oh __maybe_unused,
void *uc __maybe_unused)
@@ -54,10 +99,11 @@ static void sig_handler(int signum __maybe_unused,
 */
ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0);
ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0);
+   ioctl(fd3, PERF_EVENT_IOC_DISABLE, 0);
}
 }
 
-static int bp_event(void *fn, int setup_signal)
+static int __event(bool is_x, void *addr, int signal)
 {
struct perf_event_attr pe;
int fd;
@@ -67,8 +113,8 @@ static int bp_event(void *fn, int setup_signal)
pe.size = sizeof(struct perf_event_attr);
 
pe.config = 0;
-   pe.bp_type = HW_BREAKPOINT_X;
-   pe.bp_addr = (unsigned long) fn;
+   pe.bp_type = is_x ? HW_BREAKPOINT_X : HW_BREAKPOINT_W;
+   pe.bp_addr = (unsigned long) addr;
pe.bp_len = sizeof(long);
 
pe.sample_period = 1;
@@ -86,17 +132,25 @@ static int bp_event(void *fn, int setup_signal)
return TEST_FAIL;
}
 
-   if (setup_signal) {
-   fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
-   fcntl(fd, F_SETSIG, SIGIO);
-   fcntl(fd, F_SETOWN, getpid());
-   }
+   fcntl(fd, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
+   fcntl(fd, F_SETSIG, signal);
+   fcntl(fd, F_SETOWN, getpid());
 
ioctl(fd, PERF_EVENT_IOC_RESET, 0);
 
return fd;
 }
 
+static int bp_event(void *addr, int signal)
+{
+   return __event(true, addr, signal);
+}
+
+static int wp_event(void *addr, int signal)
+{
+   return __event(false, addr, signal);
+}
+
 static long long bp_count(int fd)
 {
long long count;
@@ -114,7 +168,7 @@ static long long bp_count(int fd)
 int test__bp_signal(int subtest __maybe_unused)
 {
struct sigaction sa;
-   long long count1, count2;
+   long long count1, count2, count3;
 
/* setup SIGIO signal handler */
memset(&sa, 0, sizeof(struct sigaction));
@@ -126,21 +180,52 @@ int test__bp_signal(int subtest __maybe_unused)
return TEST_FAIL;
}
 
+   sa.sa_sigaction = (void *) sig_handler_2;
+   if (sigaction(SIGUSR1, &sa, NULL) < 0) {
+   pr_debug("failed setting

Re: [PATCH v2 4/6] clk: mediatek: Add MT2701 clock support

2016-01-05 Thread James Liao
Hi Philipp,

On Tue, 2016-01-05 at 10:30 +0100, Philipp Zabel wrote:
> Hi James,
> 
> On Tuesday, 05.01.2016 at 14:30 +0800, James Liao wrote:
> > From: Shunli Wang 
> > 
> > Add MT2701 clock support, include topckgen, apmixedsys,
> > infracfg, pericfg and subsystem clocks.
> > 
> > Signed-off-by: Shunli Wang 
> > Signed-off-by: James Liao 
> > ---
> >  drivers/clk/mediatek/Kconfig  |8 +
> >  drivers/clk/mediatek/Makefile |1 +
> >  drivers/clk/mediatek/clk-gate.c   |   56 ++
> >  drivers/clk/mediatek/clk-gate.h   |2 +
> >  drivers/clk/mediatek/clk-mt2701.c | 1210 
> > +
> >  drivers/clk/mediatek/clk-mtk.c|   25 +
> >  drivers/clk/mediatek/clk-mtk.h|   35 +-
> >  7 files changed, 1334 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/clk/mediatek/clk-mt2701.c
> > 
> > diff --git a/drivers/clk/mediatek/Kconfig b/drivers/clk/mediatek/Kconfig
> > index dc224e6..6c7cdc0 100644
> > --- a/drivers/clk/mediatek/Kconfig
> > +++ b/drivers/clk/mediatek/Kconfig
> > @@ -6,6 +6,14 @@ config COMMON_CLK_MEDIATEK
> > ---help---
> >   Mediatek SoCs' clock support.
> >  
> > +config COMMON_CLK_MT2701
> > +   bool "Clock driver for Mediatek MT2701 and MT7623"
> > +   depends on COMMON_CLK
> > +   select COMMON_CLK_MEDIATEK
> > +   default ARCH_MEDIATEK
> > +   ---help---
> > + This driver supports Mediatek MT2701 and MT7623 clocks.
> > +
> >  config COMMON_CLK_MT8135
> > bool "Clock driver for Mediatek MT8135"
> > depends on COMMON_CLK
> > diff --git a/drivers/clk/mediatek/Makefile b/drivers/clk/mediatek/Makefile
> > index 32e7222..5b2b91b 100644
> > --- a/drivers/clk/mediatek/Makefile
> > +++ b/drivers/clk/mediatek/Makefile
> > @@ -1,4 +1,5 @@
> >  obj-$(CONFIG_COMMON_CLK_MEDIATEK) += clk-mtk.o clk-pll.o clk-gate.o 
> > clk-apmixed.o
> >  obj-$(CONFIG_RESET_CONTROLLER) += reset.o
> > +obj-$(CONFIG_COMMON_CLK_MT2701) += clk-mt2701.o
> >  obj-$(CONFIG_COMMON_CLK_MT8135) += clk-mt8135.o
> >  obj-$(CONFIG_COMMON_CLK_MT8173) += clk-mt8173.o
> > diff --git a/drivers/clk/mediatek/clk-gate.c 
> > b/drivers/clk/mediatek/clk-gate.c
> > index 576bdb7..38badb4 100644
> > --- a/drivers/clk/mediatek/clk-gate.c
> > +++ b/drivers/clk/mediatek/clk-gate.c
> > @@ -61,6 +61,26 @@ static void mtk_cg_clr_bit(struct clk_hw *hw)
> > regmap_write(cg->regmap, cg->clr_ofs, BIT(cg->bit));
> >  }
> >  
> > +static void mtk_cg_set_bit_no_setclr(struct clk_hw *hw)
> > +{
> > +   struct mtk_clk_gate *cg = to_clk_gate(hw);
> > +   u32 val;
> > +
> > +   regmap_read(cg->regmap, cg->sta_ofs, &val);
> > +   val |= BIT(cg->bit);
> > +   regmap_write(cg->regmap, cg->sta_ofs, val);
> 
> You can use regmap_update_bits here:
> 
>   u32 bit = BIT(cg->bit);
>   regmap_update_bits(cg->regmap, cg->sta_ofs, bit, bit);
> 
> > +}
> > +
> > +static void mtk_cg_clr_bit_no_setclr(struct clk_hw *hw)
> > +{
> > +   struct mtk_clk_gate *cg = to_clk_gate(hw);
> > +   u32 val;
> > +
> > +   regmap_read(cg->regmap, cg->sta_ofs, &val);
> > +   val &= ~(BIT(cg->bit));
> > +   regmap_write(cg->regmap, cg->sta_ofs, val);
> 
> and here:
> 
>   u32 bit = BIT(cg->bit);
>   regmap_update_bits(cg->regmap, cg->sta_ofs, bit, 0);

OK. I'll change it in next patch. Thanks.
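
For reference, the two forms are equivalent because regmap_update_bits() performs a masked read-modify-write: bits selected by the mask take their value from the new value, all other bits keep their old value. A userspace model of just that bit arithmetic (model_update_bits(), open_coded_set() and open_coded_clear() are hypothetical names, and no regmap is involved):

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Masked update: bits in 'mask' come from 'val', the rest from 'old'. */
uint32_t model_update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Open-coded "set one bit", as in the quoted patch. */
uint32_t open_coded_set(uint32_t old, unsigned int bit)
{
	assert(bit < 32);
	return old | BIT(bit);
}

/* Open-coded "clear one bit", as in the quoted patch. */
uint32_t open_coded_clear(uint32_t old, unsigned int bit)
{
	assert(bit < 32);
	return old & ~BIT(bit);
}
```

Passing (bit, bit) as mask and value sets the bit; passing (bit, 0) clears it.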


Best regards,

James




Re: [PATCH v5 05/11] arm-cci PMU: Delay counter writes to pmu_enable

2016-01-05 Thread Suzuki K. Poulose

On 04/01/16 19:24, Mark Rutland wrote:
> On Mon, Jan 04, 2016 at 11:54:44AM +, Suzuki K. Poulose wrote:
>> Delay setting the event periods for enabled events to pmu::pmu_enable().
>> We mark the event.hw->state PERF_HES_ARCH for the events that we know
>> have their counts recorded and have been started.
>
> Please add a comment to the code stating exactly what PERF_HES_ARCH
> means for the CCI PMU driver, so it's easy to find.

Sure.

>> +void cci_pmu_update_counters(struct cci_pmu *cci_pmu)
>> +{
>> +   int i;
>> +   unsigned long mask[BITS_TO_LONGS(cci_pmu->num_cntrs)];
>
> I think this can be:
>
> DECLARE_BITMAP(mask, cci_pmu->num_cntrs);
>
>> +
>> +   memset(mask, 0, BITS_TO_LONGS(cci_pmu->num_cntrs) * sizeof(unsigned long));
>
> Likewise:
>
> bitmap_zero(mask, cci_pmu->num_cntrs);

OK

>> +   if (!cci_pmu->hw_events.events[i]) {
>> +   WARN_ON(1);
>> +   continue;
>> +   }
>> +
>
> if (WARN_ON(!cci_pmu->hw_events.events[i]))
>         continue;

OK

>> @@ -980,8 +1015,11 @@ static void cci_pmu_start(struct perf_event *event, int pmu_flags)
>> /* Configure the counter unless you are counting a fixed event */
>> if (!pmu_fixed_hw_idx(cci_pmu, idx))
>> pmu_set_event(cci_pmu, idx, hwc->config_base);
>> -
>> -   pmu_event_set_period(event);
>> +   /*
>> +* Mark this counter, so that we can program the
>> +* counter with the event_period. see cci_pmu_enable()
>> +*/
>> +   hwc->state = PERF_HES_ARCH;
>
> Why couldn't we have kept pmu_event_set_period here, and have that set
> prev_count and PERF_HES_ARCH?
>
> Then we'd be able to do the same batching for overflow too.

The PMU is not disabled while we are in the overflow irq handler. Hence there
may not be a pmu_enable() call which would set the period for the counter that
overflowed, if we defer the write in that case. Is that assumption wrong?

Cheers
Suzuki


Re: [PATCH v2 5/6] reset: mediatek: mt2701 reset controller dt-binding file

2016-01-05 Thread James Liao
Hi Philipp,

On Tue, 2016-01-05 at 10:31 +0100, Philipp Zabel wrote:
> On Tuesday, 05.01.2016 at 14:30 +0800, James Liao wrote:
> > From: Shunli Wang 
> > 
> > Dt-binding file about reset controller is used to provide
> > kinds of definition, which is referenced by dts file and
> > IC-specified reset controller driver code.
> > 
> > Signed-off-by: Shunli Wang 
> > ---
> >  .../dt-bindings/reset-controller/mt2701-resets.h   | 74 
> > ++
> >  1 file changed, 74 insertions(+)
> >  create mode 100644 include/dt-bindings/reset-controller/mt2701-resets.h
> > 
> > diff --git a/include/dt-bindings/reset-controller/mt2701-resets.h 
> > b/include/dt-bindings/reset-controller/mt2701-resets.h
> 
> No new files in include/dt-bindings/reset-controller, please.
> This should go into include/dt-bindings/reset.

OK, I'll move it to include/dt-bindings/reset/.


Best regards,

James




Re: [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > >> The two mechanisms referenced above would likely require coordination 
> > >> with
> > >> QEMU and as such are open to discussion.  I haven't attempted to address
> > >> them as I am not sure there is a consensus as of yet.  My personal
> > >> preference would be to add a vendor-specific configuration block to the
> > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > >> essentially extend shpc to support guest live migration with pass-through
> > >> devices.
> > >
> > > shpc?
> > 
> > That is kind of what I was thinking.  We basically need some mechanism
> > to allow for the host to ask the device to quiesce.  It has been
> > proposed to possibly even look at something like an ACPI interface
> > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > case.
> > 
> > - Alex
> 
> 
> Start by using hot-unplug for this!
> 
> Really use your patch guest side, and write host side
> to allow starting migration with the device, but
> defer completing it.
> 
> So
> 
> 1.- host tells guest to start tracking memory writes
> 2.- guest acks
> 3.- migration starts
> 4.- most memory is migrated
> 5.- host tells guest to eject device
> 6.- guest acks
> 7.- stop vm and migrate rest of state
> 
> 
> It will already be a win since hot unplug after migration starts and
> most memory has been migrated is better than hot unplug before migration
> starts.
> 
> Then measure downtime and profile. Then we can look at ways
> to quiesce device faster which really means step 5 is replaced
> with "host tells guest to quiesce device and dirty (or just unmap!)
> all memory mapped for write by device".


Doing a hot-unplug is going to upset the guest network stack's view
of the world; that's something we don't want to change.

Dave
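
For reference, the ordering in the quoted proposal can be written down as a small state machine. This is only an illustrative sketch: the step names and the `MigrationSequence` class are hypothetical, not QEMU or kernel interfaces. They simply encode steps 1 to 7 and reject out-of-order transitions.

```python
# Hypothetical model of the 7-step sequence from the quoted proposal.
# None of these names are QEMU or kernel APIs.
STEPS = [
    "host_requests_dirty_tracking",  # 1. host tells guest to start tracking writes
    "guest_acks_tracking",           # 2. guest acks
    "migration_starts",              # 3. migration starts
    "bulk_memory_migrated",          # 4. most memory is migrated
    "host_requests_device_eject",    # 5. host tells guest to eject device
    "guest_acks_eject",              # 6. guest acks
    "stop_vm_and_migrate_rest",      # 7. stop vm and migrate rest of state
]


class MigrationSequence:
    """Accepts the steps strictly in order and rejects anything else."""

    def __init__(self):
        self.completed = []

    def finished(self):
        return len(self.completed) == len(STEPS)

    def advance(self, step):
        if self.finished():
            raise RuntimeError("sequence already finished")
        expected = STEPS[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"expected {expected!r}, got {step!r}")
        self.completed.append(step)

    def device_attached(self):
        # The pass-through device stays with the guest until the eject is acked.
        return "guest_acks_eject" not in self.completed
```

In this model the device is still attached while the bulk of memory is copied, which is the point of deferring the eject to step 5.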

> 
> -- 
> MST
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] i2c: rk3x: init module as subsys call

2016-01-05 Thread Wolfram Sang

> > Tomeu from Collabora is working on some better scheme to optimize device 
> > probing order but it looks like this may be a bit off still.

...

> I don't just talk about touch screen driver, most i2c device driver such
> as input sensor/camera/rtc/battery will suffer. So people will see their
> drivers do not work or slow down on rk3368 platform :(

I totally agree that the current situation is not ideal. This is why it
has to be *properly fixed* and not worked around (which caused other side
effects in the past). If you care about it, then please help Tomeu with
his patchset.





Re: [PATCH v3 1/3] clocksource/vt8500: Increase the minimum delta

2016-01-05 Thread Russell King - ARM Linux
On Tue, Jan 05, 2016 at 12:42:42PM +0300, Roman Volkov wrote:
> Why multiply by two? Good question. Maybe there is a reserve for
> stability. The value passed by the system to the set_next_event() should
> be not lesser than this value, and theoretically, we should not
> multiply MIN_OSCR_DELTA by two. As I can see, in many drivers there is
> no such minimal values at all.

It's a speciality of the StrongARM/PXA hardware.  It takes a certain
number of OSCR cycles for the value written to hit the compare registers.
So, if a very small delta is written (eg, the compare register is written
with a value of OSCR + 1), the OSCR will have incremented past this value
before it hits the underlying hardware.  The result is, that you end up
waiting a very long time for the OSCR to wrap before the event fires.

So, we introduce a check in set_next_event() to detect this and return
-ETIME if the calculated delta is too small, which causes the generic
clockevents code to retry after adding the min_delta specified in
clockevents_config_and_register() to the current time value.

min_delta must be sufficient that we don't re-trip the -ETIME check - if
we do, we will return -ETIME, forward the next event time, try to set it,
return -ETIME again, and basically lock the system up.  So, min_delta
must be larger than the check inside set_next_event().  A factor of two
was chosen to ensure that this situation would never occur.

The PXA code worked on PXA systems for years, and I'd suggest no one
changes this mechanism without access to a wide range of PXA systems,
otherwise they're risking breakage.
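
The retry behaviour described above can be illustrated with a small user-space model. This is a sketch, not kernel code: `set_next_event()` and `program_event()` are hypothetical Python stand-ins, the retry policy is simplified to "on -ETIME, try again with min_delta", and the values 16 and 4 are the ones mentioned in the vt8500 patch description in this thread.

```python
ETIME = 62  # Linux errno value, used here only as a marker


def set_next_event(delta_cycles, min_oscr_delta):
    """Model of a driver callback that refuses deltas too small to program."""
    if delta_cycles < min_oscr_delta:
        return -ETIME
    return 0


def program_event(requested, min_delta, min_oscr_delta, max_retries=10):
    """Simplified model of the generic retry path.

    On -ETIME the request is retried with min_delta. Returns the number of
    attempts needed, or None if programming never succeeded (the real
    system would keep retrying, i.e. lock up).
    """
    delta = requested
    for attempt in range(1, max_retries + 1):
        if set_next_event(delta, min_oscr_delta) == 0:
            return attempt
        delta = min_delta
    return None


MIN_OSCR_DELTA = 16

# min_delta above the driver's check: the retry succeeds.
ok = program_event(requested=1, min_delta=MIN_OSCR_DELTA * 2,
                   min_oscr_delta=MIN_OSCR_DELTA)

# min_delta below the driver's check (4 < 16): every retry fails again.
stuck = program_event(requested=1, min_delta=4,
                      min_oscr_delta=MIN_OSCR_DELTA)
```

Any min_delta at or above the check would pass in this model; the factor of two is simply a safety margin.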

-- 
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH] base/platform: Fix platform drivers with no probe callback (ex alarmtimer)

2016-01-05 Thread Tero Roponen

On Tue, 5 Jan 2016, Uwe Kleine-König wrote:

> Hello,
> 
> I think this is the same problem that another Martin found and fixed in
> 
> http://mid.gmane.org/1449132704-9952-1-git-send-email-martin.wi...@ts.fujitsu.com
> 
> I didn't check, but thought Greg already picked that up?!

I can confirm that applying the patch in that link to 4.4-rc8 fixes the
following problem I've seen since 4.4-rc5:

BUG: unable to handle kernel NULL pointer dereference at 0e20
IP: [] asus_sysfs_is_visible+0xe/0x1d0 [asus_laptop]

So:

Tested-by: Tero Roponen 

Re: [PATCH -next] MIPS: VDSO: Fix build error with binutils 2.24 and earlier

2016-01-05 Thread James Hogan
On Tue, Jan 05, 2016 at 10:20:59AM +0100, Michal Marek wrote:
> On 2015-12-24 13:57, James Hogan wrote:
> > On Thu, Dec 24, 2015 at 12:48:12PM +, James Hogan wrote:
> >> Hi Guenter,
> >>
> >> On Wed, Dec 23, 2015 at 09:04:31PM -0800, Guenter Roeck wrote:
> >>> Commit 2a037f310bab ("MIPS: VDSO: Fix build error") tries to fix a build
> >>> error seen with binutils 2.24 and earlier. However, the fix does not work,
> >>> and again results in the already known build errors if the kernel is built
> >>> with an earlier version of binutils.
> >>>
> >>> CC  arch/mips/vdso/gettimeofday.o
> >>> /tmp/ccnOVbHT.s: Assembler messages:
> >>> /tmp/ccnOVbHT.s:50: Error: can't resolve `_start' {*UND* section} - `L0 
> >>> {.text section}
> >>> /tmp/ccnOVbHT.s:374: Error: can't resolve `_start' {*UND* section} - `L0 
> >>> {.text section}
> >>> scripts/Makefile.build:258: recipe for target 
> >>> 'arch/mips/vdso/gettimeofday.o' failed
> >>> make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
> >>>
> >>> Fixes: 2a037f310bab ("MIPS: VDSO: Fix build error")
> >>> Cc: Qais Yousef 
> >>> Signed-off-by: Guenter Roeck 
> >>> ---
> >>> Tested with binutils 2.25 and 2.22.
> >>>
> >>>  arch/mips/vdso/Makefile | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
> >>> index 018f8c7b94f2..14568900fc1d 100644
> >>> --- a/arch/mips/vdso/Makefile
> >>> +++ b/arch/mips/vdso/Makefile
> >>> @@ -26,7 +26,7 @@ aflags-vdso := $(ccflags-vdso) \
> >>>  # the comments on that file.
> >>>  #
> >>>  ifndef CONFIG_CPU_MIPSR6
> >>> -  ifeq ($(call ld-ifversion, -lt, 22500000, y),)
> >>> +  ifeq ($(call ld-ifversion, -lt, 22500000, y),y)
> >>
> >> I agree this is semantically correct, but there is something more evil
> >> going on here.
> >>
> >> Originally the check was version <= 2.24
> >> Qais' patch changed it to version >= 2.25 (intending version < 2.25)
> >> Your patch changes it to version < 2.25
> >>
> >> I think the reason this fixed the problem for Qais is actually that he
> >> probably had a similar toolchain version to what I'm using:
> >>
> >> GNU ld (Codescape GNU Tools 2015.06-05 for MIPS MTI Linux) 2.24.90
> >>
> >> ./scripts/ld-version.sh does this:
> >>
> >> print a[1]*10000000 + a[2]*100000 + a[3]*10000 + a[4]*100 + a[5];
> >>
> >> which changes that version number into:
> >>  20000000
> >> + 2400000
> >> +  900000 = 23300000
> >>
> >> I.e. it doesn't expect a[3] to be >= 10.
> >>
> >> Should we do something like this (increase multipliers on a[1] and
> >> a[2])?:
> >>
> >> diff --git a/scripts/ld-version.sh b/scripts/ld-version.sh
> >> index 198580d245e0..0b67edc5bc6f 100755
> >> --- a/scripts/ld-version.sh
> >> +++ b/scripts/ld-version.sh
> >> @@ -3,6 +3,6 @@
> >>{
> >>gsub(".*)", "");
> >>split($1,a, ".");
> >> -  print a[1]*10000000 + a[2]*100000 + a[3]*10000 + a[4]*100 + a[5];
> >> +  print a[1]*100000000 + a[2]*1000000 + a[3]*10000 + a[4]*100 + a[5];
> >>exit
> >>}
> >>
> >> which gives 2.24.90 => 224900000.
> >>
> >> All call sites would need updating too to add the extra 0, but a quick
> >> git grep isn't showing any other ones than this one.
> > 
> > Actually, linux-next includes this commit which uses ld-ifversion too:
> > 
> > 19a3cc83353e3bb4bc28769f8606139a3d350d2d
> > "Kbuild, lto: Add Link Time Optimization support v3"
> 
> That commit needs updating for other reasons, so feel free to fix
> ld-ifversion and its usage in arch/mips.

Thanks. This change is now in linux-next, and will hopefully be included
in v4.4:
http://patchwork.linux-mips.org/patch/11931/

Cheers
James
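
For readers following the arithmetic, here is a small sketch of the version encoding problem discussed in the quoted text. It is a Python re-statement, not the awk script itself; the multipliers (10000000 and 100000 before the change, 100000000 and 1000000 after) are assumed to match scripts/ld-version.sh at the time, and the helper names are made up for illustration.

```python
def ld_version(version, major_mul, minor_mul):
    """Fold a dotted version string into one integer, awk-script style."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (5 - len(parts))  # missing fields count as 0
    a1, a2, a3, a4, a5 = parts[:5]
    return a1 * major_mul + a2 * minor_mul + a3 * 10000 + a4 * 100 + a5


def old_encoding(version):
    # Only one decimal digit of room for the third field.
    return ld_version(version, 10_000_000, 100_000)


def new_encoding(version):
    # Two decimal digits of room for the third field.
    return ld_version(version, 100_000_000, 1_000_000)


# With the old multipliers 2.24.90 overflows into the minor field and
# compares as newer than 2.25.
old_misordered = old_encoding("2.24.90") > old_encoding("2.25")

# With the wider multipliers the ordering is correct again.
new_ordered = new_encoding("2.24.90") < new_encoding("2.25")
```

This is why a "version < 2.25" test could never match a 2.24.90 linker under the old encoding.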




4.3.3: error when plugging in USB stick

2016-01-05 Thread Rolf Eike Beer
[199328.874819] ------------[ cut here ]------------
[199328.874825] WARNING: CPU: 3 PID: 15727 at ../block/genhd.c:626 add_disk+0x43e/0x480()
[199328.874827] Modules linked in: hid_cherry usb_storage cdc_phonet phonet hid_generic usbhid fuse ctr ccm af_packet nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit bnep ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables drbg ansi_cprng dm_crypt nls_iso8859_1 nls_cp437 vfat fat btusb btintel btbcm btrtl bluetooth arc4 joydev rtsx_pci_ms memstick rtsx_pci_sdmmc mmc_core iTCO_wdt iTCO_vendor_support iwlmvm mac80211 pcspkr serio_raw iwlwifi cfg80211 snd_hda_codec_via rfkill snd_hda_codec_hdmi rtsx_pci snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp
[199328.874868]  coretemp kvm_intel kvm lpc_ich crct10dif_pclmul crc32_pclmul crc32c_intel mfd_core i2c_i801 aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 shpchp wmi fjes xhci_pci xhci_hcd tpm_infineon tpm_tis tpm snd_hda_intel snd_hda_codec mei_me snd_hda_core e1000e snd_hwdep mei ptp snd_pcm pps_core snd_seq snd_timer snd_seq_device snd soundcore sg battery ac efivarfs i915 drm_kms_helper ehci_pci ehci_hcd drm usbcore fb_sys_fops sysimgblt usb_common sysfillrect syscopyarea i2c_algo_bit thermal video button processor scsi_dh_alua scsi_dh_emc scsi_dh_hp_sw scsi_dh_rdac raid456 async_raid6_recov async_pq async_xor xor async_memcpy async_tx raid6_pq raid10 raid1 raid0 md_mod dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_mod
[199328.874916] CPU: 3 PID: 15727 Comm: kworker/u16:2 Not tainted 4.3.3-2.gdb72752-default #1
[199328.874917] Hardware name: Notebook W740SU /W740SU , BIOS 4.6.5 07/05/2013
[199328.874923] Workqueue: events_unbound async_run_entry_fn
[199328.874925]  81a75772 8800ae937d18 81376259 

[199328.874927]  8800ae937d50 8107afc2 88020d3a4000 
88020d3a4080
[199328.874930]  88020d3a400c 8802145bc9d0 8802172ef480 
8800ae937d60
[199328.874932] Call Trace:
[199328.874941]  [] try_stack_unwind+0x175/0x190
[199328.874948]  [] dump_trace+0x69/0x3a0
[199328.874951]  [] show_trace_log_lvl+0x4b/0x60
[199328.874954]  [] show_stack_log_lvl+0x10c/0x180
[199328.874957]  [] show_stack+0x25/0x50
[199328.874962]  [] dump_stack+0x4b/0x72
[199328.874967]  [] warn_slowpath_common+0x82/0xc0
[199328.874971]  [] warn_slowpath_null+0x1a/0x20
[199328.874973]  [] add_disk+0x43e/0x480
[199328.874980]  [] sd_probe_async+0x115/0x1d0
[199328.874984]  [] async_run_entry_fn+0x48/0x150
[199328.874990]  [] process_one_work+0x159/0x470
[199328.874993]  [] worker_thread+0x48/0x4a0
[199328.874996]  [] kthread+0xc9/0xe0
[199328.875001]  [] ret_from_fork+0x3f/0x70
[199328.876721] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70

[199328.876722] Leftover inexact backtrace:

[199328.876725]  [] ? kthread_worker_fn+0x170/0x170
[199328.876727] ---[ end trace 0335892bcf3ba8c0 ]---




Re: [RFC PATCH v2] perf test: Improve bp_signal

2016-01-05 Thread Jiri Olsa
On Tue, Jan 05, 2016 at 09:57:55AM +, Wang Nan wrote:
> Will Deacon [1] has some question on patch [2]. This patch improves
> test__bp_signal so we can test:
> 
>  1. A watchpoint and a breakpoint that fire on the same instruction
>  2. Nested signals
> 
> Test result:
> 
>  On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
> 
>  # ./perf test -v signal
>  17: Test breakpoint overflow signal handler  :
>  --- start ---
>  test child forked, pid 10213
>  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
>  test child finished with 0
>  ---- end ----
>  Test breakpoint overflow signal handler: Ok
> 
> So at least 2 cases Will doubted are handled correctly.
> 
> [1] http://lkml.kernel.org/g/20160104165535.gi1...@arm.com
> [2] 
> http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangn...@huawei.com
> 
> Signed-off-by: Wang Nan 
> Signed-off-by: Jiri Olsa 
> Cc: Will Deacon 
> Cc: Arnaldo Carvalho de Melo 
> ---
> 
> v1 -> v2: Improve readability, fix typo. Thanks to Jiri Olsa.
> 
> To Jiri: I guess you will be okay to provide your SOB for your code at [3],
>  so I add it in this v2 patch.

sure, patch looks good to me

thanks,
jirka


Re: [PATCH v3 1/3] clocksource/vt8500: Increase the minimum delta

2016-01-05 Thread Daniel Lezcano

On 01/05/2016 10:42 AM, Roman Volkov wrote:
> On Tue, 5 Jan 2016 10:01:07 +0100, Daniel Lezcano wrote:
>
>> On 01/01/2016 02:24 PM, Roman Volkov wrote:
>>> From: Roman Volkov
>>>
>>> The vt8500 clocksource driver declares itself as capable to handle
>>> the minimum delay of 4 cycles by passing the value into
>>> clockevents_config_and_register(). The vt8500_timer_set_next_event()
>>> requires the passed cycles value to be at least 16. The impact is
>>> that userspace hangs in nanosleep() calls with small delay
>>> intervals.
>>>
>>> This problem is reproducible in Linux 4.2 starting from:
>>> c6eb3f70d448 ('hrtimer: Get rid of hrtimer softirq')
>>>
>>> Signed-off-by: Roman Volkov
>>> Acked-by: Alexey Charkov
>>
>> Hi Roman,
>>
>> I looked at the email thread, and IIUC if set_next_event fails, the
>> system freezes. Your patch fixes the issue for your driver but not the
>> real issue because if set_next_event fails, at least a warning should
>> appear in the log or better nanosleep should fail gracefully.
>
> Hi Daniel,
>
> I agree, but if nanosleep will return immediately, this can lead to
> undefined behavior in the software.

The nanosleep syscall is supposed to return an error code. If the
software does not pay attention to the syscall's return code, then the
bug is in the software, it is not up to the kernel to work around it.

> Maybe the system can go busyloop
> to somehow recover from this state and print a message to the log? At
> the driver level it seems to be enough to fail the function without
> printing logs.
>
>> BTW why min delta is MIN_OSCR_DELTA * 2 in
>> clockevents_config_and_register ?
>
> All this just to be consistent with PXA. Maybe PXA works with lesser
> values, e.g., 8. For vt8500, accessing the registers is more complex,
> and this should consume more time. IIUC, if the driver does not support
> too small delays, the system will handle it with busyloop?

[ Added John Stultz and Thomas Gleixner ] to answer those questions above.

> Why multiply by two? Good question. Maybe there is a reserve for
> stability. The value passed by the system to the set_next_event() should
> be not lesser than this value, and theoretically, we should not
> multiply MIN_OSCR_DELTA by two. As I can see, in many drivers there is
> no such minimal values at all.
>
> Added Robert
>
> Regards,
> Roman




--
  Linaro.org │ Open source software for ARM SoCs



Re: [PATCH] arm64: fix add kasan bug

2016-01-05 Thread Catalin Marinas
On Thu, Dec 31, 2015 at 10:09:09AM +, zhongjiang wrote:
> From: zhong jiang 
> 
> In general, each process have 16kb stack space to use, but
> stack need extra space to store red_zone when kasan enable.
> the patch fix above question.
> 
> Signed-off-by: zhong jiang 
> ---
>  arch/arm64/include/asm/thread_info.h | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/thread_info.h 
> b/arch/arm64/include/asm/thread_info.h
> index 90c7ff2..45b5a7e 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
[...]
> +#ifdef CONFIG_KASAN
> +#define THREAD_SIZE  32768
> +#else
>  #define THREAD_SIZE  16384
> +#endif

I'm not really keen on increasing the stack size to 32KB when KASan is
enabled (that's 8 4K pages). Have you actually seen a real problem with
the default size? How large is the red_zone?

With 4.5 we are going for separate IRQ stack on arm64, so the typical
stack overflow case no longer exists.

-- 
Catalin


Re: [PATCH v4 5/6] mfd: dt-bindings: add device tree bindings for Hi3519 sysctrl

2016-01-05 Thread Philipp Zabel
Am Mittwoch, den 30.12.2015, 09:43 +0800 schrieb Jiancheng Xue:
> Add device tree bindings for Hi3519 system controller.
> 
> Signed-off-by: Jiancheng Xue 
> ---
>  Documentation/devicetree/bindings/mfd/hi3519.txt | 14 ++
>  1 file changed, 14 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mfd/hi3519.txt
> 
> diff --git a/Documentation/devicetree/bindings/mfd/hi3519.txt 
> b/Documentation/devicetree/bindings/mfd/hi3519.txt
> new file mode 100644
> index 000..2536edc
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mfd/hi3519.txt
> @@ -0,0 +1,14 @@
> +* Hisilicon Hi3519 System Controller Block
> +
> +This bindings use the following binding:
> +Dcumentation/devicetree/bindings/clock/clock-bindings.txt

Typo: "Documentation"
- but I don't see the clock bindings being used here at all.
Maybe just drop this sentence?

> +
> +Required properties:
> +- compatible: "hisilicon,hi3519-sysctrl".
> +- reg: the register region of this block
> +
> +Examples:
> +sysctrl: system-controller@1201 {
> + compatible = "hisilicon,hi3519-sysctrl", "syscon";
> + reg = <0x1201 0x1000>;
> +};

regards
Philipp



Re: [PATCH v4 1/6] clk: hisilicon: add CRG driver for hi3519 soc

2016-01-05 Thread Philipp Zabel
Hi Jiancheng,

Am Mittwoch, den 30.12.2015, 09:43 +0800 schrieb Jiancheng Xue:
> The CRG(Clock and Reset Generator) block provides clock
> and reset signals for other modules in hi3519 soc.
> 
> Signed-off-by: Jiancheng Xue 
> ---
>  .../devicetree/bindings/clock/hi3519-crg.txt   |  46 +++
>  drivers/clk/hisilicon/Kconfig  |   7 +
>  drivers/clk/hisilicon/Makefile |   2 +
>  drivers/clk/hisilicon/clk-hi3519.c | 103 ++
>  drivers/clk/hisilicon/reset.c  | 149 +
>  drivers/clk/hisilicon/reset.h  |  32 +
>  include/dt-bindings/clock/hi3519-clock.h   |  43 ++
>  7 files changed, 382 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/clock/hi3519-crg.txt
>  create mode 100644 drivers/clk/hisilicon/clk-hi3519.c
>  create mode 100644 drivers/clk/hisilicon/reset.c
>  create mode 100644 drivers/clk/hisilicon/reset.h
>  create mode 100644 include/dt-bindings/clock/hi3519-clock.h
> 
> diff --git a/Documentation/devicetree/bindings/clock/hi3519-crg.txt 
> b/Documentation/devicetree/bindings/clock/hi3519-crg.txt
> new file mode 100644
> index 000..2d23950
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/clock/hi3519-crg.txt
> @@ -0,0 +1,46 @@
> +* Hisilicon Hi3519 Clock and Reset Generator(CRG)
> +
> +The Hi3519 CRG module provides clock and reset signals to various
> +controllers within the SoC.
> +
> +This binding uses the following bindings:
> +Documentation/devicetree/bindings/clock/clock-bindings.txt
> +Documentation/devicetree/bindings/reset/reset.txt
> +
> +Required Properties:
> +
> +- compatible: should be one of the following.
> +  - "hisilicon,hi3519-crg" - controller compatible with Hi3519 SoC.
> +
> +- reg: physical base address of the controller and length of memory mapped
> +  region.
> +
> +- #clock-cells: should be 1.
> +
> +Each clock is assigned an identifier and client nodes use this identifier
> +to specify the clock which they consume.
> +
> +All these identifier could be found in .
> +
> +- #reset-cells: should be 2.
> +
> +A reset signal can be controlled by writing a bit register in the CRG module.
> +The reset specifier consists of two cells. The first cell represents the
> +register offset relative to the base address. The second cell represents the
> +bit index in the register.

Are the resets controlled by single bits spread around the register
space? If so, I'm fine with this binding.
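
To make the two-cell specifier concrete, here is a minimal sketch of how a consumer entry such as `resets = <&CRG 0xe4 0>` would select a register and a bit. The register file is modelled as a plain dict, the function names are hypothetical, and the polarity (setting the bit asserts the reset) is an assumption that the binding text does not state.

```python
def reset_assert(regs, offset, bit):
    """Set the reset bit in the register at `offset` (assumed: 1 = in reset)."""
    regs[offset] = regs.get(offset, 0) | (1 << bit)


def reset_deassert(regs, offset, bit):
    """Clear the reset bit in the register at `offset`."""
    regs[offset] = regs.get(offset, 0) & ~(1 << bit)


def reset_status(regs, offset, bit):
    """Return True while the modelled reset line is asserted."""
    return bool(regs.get(offset, 0) & (1 << bit))


# Specifier cells from the example: register offset 0xe4, bit index 0.
crg_regs = {}
spec = (0xE4, 0)

reset_assert(crg_regs, *spec)
asserted = reset_status(crg_regs, *spec)

reset_deassert(crg_regs, *spec)
released = not reset_status(crg_regs, *spec)
```

With one bit per reset line, two cells are enough to address any reset regardless of where its bit sits in the register space.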

> +Example: CRG nodes
> +CRG: clock-reset-controller@1201 {
> + compatible = "hisilicon,hi3519-crg";
> +reg = <0x1201 0x1>;
> +#clock-cells = <1>;
> +#reset-cells = <2>;
> +};
> +
> +Example: consumer nodes
> +i2c0: i2c@1211 {
> + compatible = "hisilicon,hi3519-i2c";
> +reg = <0x1211 0x1000>;
> +clocks = <&CRG HI3519_I2C0_RST>;*/
> +resets = <&CRG 0xe4 0>;
> +};
> diff --git a/drivers/clk/hisilicon/Kconfig b/drivers/clk/hisilicon/Kconfig
> index e434854..b6baebf 100644
> --- a/drivers/clk/hisilicon/Kconfig
> +++ b/drivers/clk/hisilicon/Kconfig
> @@ -1,3 +1,10 @@
> +config COMMON_CLK_HI3519
> + tristate "Clock Driver for Hi3519"
> + depends on ARCH_HISI
> + default y
> + help
> +   Build the clock driver for hi3519.
> +
>  config COMMON_CLK_HI6220
>   bool "Hi6220 Clock Driver"
>   depends on ARCH_HISI || COMPILE_TEST
> diff --git a/drivers/clk/hisilicon/Makefile b/drivers/clk/hisilicon/Makefile
> index 74dba31..3f57b09 100644
> --- a/drivers/clk/hisilicon/Makefile
> +++ b/drivers/clk/hisilicon/Makefile
> @@ -4,8 +4,10 @@
>  
>  obj-y+= clk.o clkgate-separated.o clkdivider-hi6220.o
>  
> +obj-$(CONFIG_RESET_CONTROLLER)   += reset.o
>  obj-$(CONFIG_ARCH_HI3xxx)+= clk-hi3620.o
>  obj-$(CONFIG_ARCH_HIP04) += clk-hip04.o
>  obj-$(CONFIG_ARCH_HIX5HD2)   += clk-hix5hd2.o
>  obj-$(CONFIG_COMMON_CLK_HI6220)  += clk-hi6220.o
>  obj-$(CONFIG_STUB_CLK_HI6220)+= clk-hi6220-stub.o
> +obj-$(CONFIG_COMMON_CLK_HI3519)  += clk-hi3519.o
> diff --git a/drivers/clk/hisilicon/clk-hi3519.c 
> b/drivers/clk/hisilicon/clk-hi3519.c
> new file mode 100644
> index 000..e220234
> --- /dev/null
> +++ b/drivers/clk/hisilicon/clk-hi3519.c
> @@ -0,0 +1,103 @@
> +/*
> + * Copyright (c) 2015 HiSilicon Technologies Co., Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Pu

Re: [PATCH] arm64: fix add kasan bug

2016-01-05 Thread Catalin Marinas
On Mon, Jan 04, 2016 at 01:13:33PM -0800, Andrew Morton wrote:
> On Thu, 31 Dec 2015 18:09:09 +0800 zhongjiang  wrote:
> 
> > From: zhong jiang 
> > 
> > In general, each process have 16kb stack space to use, but
> > stack need extra space to store red_zone when kasan enable.
> > the patch fix above question.
> 
> Thanks.  I grabbed this, but would prefer that the arm64 people handle
> it?

I would also prefer taking such a fix via the arm64 tree, though we are
currently still going through the post-holiday email backlog.

-- 
Catalin


Re: [PATCH] of/platform: export of_default_bus_match_table

2016-01-05 Thread Arnd Bergmann
On Tuesday 05 January 2016 11:17:53 Masahiro Yamada wrote:
> Currently, drivers/bus/uniphier-system-bus.c is kept from being a
> module due to the unresolved reference to of_default_bus_match_table.
> 
> Refer to commit 326ea45aa827 ("bus: uniphier: allow only built-in
> driver").
> 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  drivers/of/platform.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index af98343..8d103e4 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -31,6 +31,7 @@ const struct of_device_id of_default_bus_match_table[] = {
>  #endif /* CONFIG_ARM_AMBA */
> {} /* Empty terminated list */
>  };
> +EXPORT_SYMBOL(of_default_bus_match_table);

I wonder if the uniphier bus should actually use the default
match table at all. Sorry for not having thought of that when
I did my patch.

What kinds of devices do you see below this bus? Do you have multiple
levels of devices? Are they all platform devices or could they
be AMBA?

Arnd


Re: [PATCH] rt2x00pci: Disable memory-write-invalidate when the driver exits

2016-01-05 Thread Helmut Schaa
On Tue, Jan 5, 2016 at 2:27 AM, Jia-Ju Bai  wrote:
> On 01/05/2016 12:50 AM, Helmut Schaa wrote:
>>
>> On Mon, Jan 4, 2016 at 8:55 AM, Jia-Ju Bai  wrote:
>>>
>>> The driver calls pci_set_mwi to enable memory-write-invalidate when it
>>> is initialized, but does not call pci_clear_mwi when it is removed. Many
>>> other drivers calls pci_clear_mwi when pci_set_mwi is called, such as
>>> r8169, 8139cp and e1000.
>>>
>>> This patch adds pci_clear_mwi in error handling and removal procedure,
>>> which can fix the problem.
>>>
>>> Signed-off-by: Jia-Ju Bai
>>
>> Looks good to me.
>> Does this fix any actual issue?
>> If yes it might be worth mentioning it in the commit message.
>> Helmut
>>
>
> Lacking pci_clear_mwi may cause a resource-release omission,
> but this omission may not cause obvious issues.
> For reliability, it is better to add pci_clear_mwi in the driver.
> Many other drivers do so, such as r8169, 8139cp and e1000.

Thanks for clarification, fine with me then.

Acked-by: Helmut Schaa 


Re: Nokia N900: Broken lirc ir-rx51 driver

2016-01-05 Thread Pali Rohár
On Saturday 02 January 2016 09:06:57 Tony Lindgren wrote:
> Hi,
> 
> * Pali Rohár  [160102 06:46]:
> > --- a/drivers/media/rc/ir-rx51.c
> > +++ b/drivers/media/rc/ir-rx51.c
> > @@ -25,9 +25,9 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> > -#include 
> > -#include 
> > +#include "../../../arch/arm/plat-omap/include/plat/dmtimer.h"
> 
> Well we don't want to export the dmtimer functions to drivers. But
> we now have the PWM driver that can already be used for most of the
> ir-rx51.c.

OK. Is the PWM driver included in the mainline kernel?

> >  #include 
> >  #include 
> > @@ -208,7 +208,7 @@ static int lirc_rx51_init_port(struct lirc_rx51 
> > *lirc_rx51)
> > }
> >  
> > clk_fclk = omap_dm_timer_get_fclk(lirc_rx51->pwm_timer);
> > -   lirc_rx51->fclk_khz = clk_fclk->rate / 1000;
> > +   lirc_rx51->fclk_khz = clk_get_rate(clk_fclk) / 1000;
> >  
> > return 0;
> >  
> > 
> > So Tony, you are author of that commit (a62a6e98c3) which broke ir-rx51
> > module for Nokia N900. Do you know how to fix this driver for upstream
> > kernel? It would be great to have driver working and not to have it in
> > this dead state...
> 
> Yup please take a look at thread "[PATCH 0/3] pwm: omap: Add PWM support
> using dual-mode timers". Chances are we still need to set up the dmtimer
> code to provide also irqchip functions. That way ir-rx51.c can just do
> request_irq on the selected dmtimer for interrupts.

Now I see that the patch from that thread uses dmtimer.h from plat-omap. So
is it really OK?

> > Also platform data for this driver are only in legacy board code.
> > Support in DTS is missing, so driver (after fixing above problem) cannot
> > be used on DT booted kernel.
> 
> Yeah those parts should be already doable with the PWM timer code AFAIK.
> 
> Regards,
> 
> Tony
> 
> 

-- 
Pali Rohár
pali.ro...@gmail.com


Re: [PATCH] BTRFS: Adds an option to select RAID Stripe size

2016-01-05 Thread David Sterba
On Thu, Dec 31, 2015 at 08:46:36AM +0800, Qu Wenruo wrote:
> > Let me note that a good reputation is also built from patch reviews
> > (hint hint).
> 
> I must admit I'm a bad reviewer.
> When I review something, I always feel an urge to rewrite part or all of
> the patch to follow my idea, even when it's just a choice between different
> designs.

Yeah that's natural, but even if one does not completely agree, it's
still possible to verify that the implementation is correct.

The reviews also help to find and share a common style of
implementation, so the maintainers don't scream when they see a patch and
developers are not surprised that the patches take several rounds.


Re: [bug] wrong result of android callchain

2016-01-05 Thread Peter Zijlstra
On Tue, Jan 05, 2016 at 05:14:37PM +0800, He Kuang wrote:
> I found a wrong aarch64 callchain result when using perf script on
> an Android phone.

Might help to include the AARGH64 people then.. seeing I have no clue
about all that. Cc's added, email preserved etc..

> 
> Here's the callchain record fragment from the output of perf script:
> 
>   init   369 [002]   339.970607: raw_syscalls:sys_enter: NR 22 (b, 
> 7fd9e360a0, 10, , 0, 8)
>  ...
>230ac [unknown] (/system/lib64/libsurfaceflinger.so)
> 11a0 main (/system/bin/surfaceflinger)
>1c3fc __libc_init (/system/lib64/libc.so)
>  fd0 _start (/system/bin/surfaceflinger)
> 29ec __dl__start (/system/bin/linker64)
> 
> The fault occurred in the '[unknown]' line. From the objdump output of
> /system/bin/surfaceflinger, we can see the branch instruction before
> 0x11a0:
> 
>  # objdump /system/bin/surfaceflinger
> 1198:   f9400fe0    ldr x0, [sp,#24]
> 119c:   9705        bl  db0 <_ZN7android14SurfaceFlinger3runEv@plt>
> 11a0:   f9400be8    ldr x8, [sp,#16]
> 11a4:   b4c8        cbz x8, 11bc
> 
> The function '_ZN7android14SurfaceFlinger3runEv' is located at 0x3a094
> ~ 0x3a0ac in libsurfaceflinger.so, but perf misparsed that value to
> 0x230ac:
> 
>  # objdump libsurfaceflinger.so
>   0003a094 <_ZN7android14SurfaceFlinger3runEv>:
> 3a094:   a9be4ff4    stp x20, x19, [sp,#-32]!
> 3a098:   a9017bfd    stp x29, x30, [sp,#16]
> 3a09c:   910043fd    add x29, sp, #0x10
> 3a0a0:   910c0013    add x19, x0, #0x300
> 3a0a4:   aa1303e0    mov x0, x19
> 3a0a8:   97fff12f    bl  36564 <_ZN7android12MessageQueue11waitMessageEv>
> 3a0ac:   17fe        b   3a0a4 <_ZN7android14SurfaceFlinger3runEv+0x10>
> 
> There's a difference of 0x17000 between those two offsets, it seems
> that this value is the VirtAddr of this dynamic library.
> 
>  # readelf -a libsurfaceflinger.so
>   Program Headers:
> Type   Offset VirtAddr   PhysAddr
>FileSizMemSiz  Flags  Align
> LOAD   0x 0x00017000 0x00017000
>0x00057258 0x00057258  R E1000
> 
> 
> 
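
The 0x17000 difference in the quoted report matches the usual ELF relation between a file offset and a virtual address within a LOAD segment. A minimal sketch of that translation follows. The helper names are illustrative, and the segment's file offset is assumed to be 0 because that column is unreadable in the readelf output above.

```python
def file_offset_to_vaddr(offset, p_offset, p_vaddr):
    """Translate a file offset inside a LOAD segment to its virtual address."""
    return offset - p_offset + p_vaddr


def vaddr_to_file_offset(vaddr, p_offset, p_vaddr):
    """Inverse translation: virtual address back to file offset."""
    return vaddr - p_vaddr + p_offset


P_OFFSET = 0x0      # assumption: the column is garbled in the report
P_VADDR = 0x17000   # VirtAddr of the LOAD segment from readelf

# perf printed 0x230ac, while objdump shows the return site at 0x3a0ac.
resolved = file_offset_to_vaddr(0x230AC, P_OFFSET, P_VADDR)
```

If this reading is right, the reported address is a file offset that was never rebased by the segment's VirtAddr before the symbol lookup.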


Re: [PATCH 0/6] perf tools: Various fixes

2016-01-05 Thread Jiri Olsa
On Sun, Dec 27, 2015 at 12:57:16PM +0100, Jiri Olsa wrote:
> On Fri, Dec 18, 2015 at 11:06:56AM +0200, Noel Grandin wrote:
> > This series is
> > 
> > Tested-By: Noel Grandin 
> > 
> > On 2015-12-17 10:26 PM, Jiri Olsa wrote:
> > >hi,
> > >sending several changes together:
> > >   - leftover for the stat enable/disable changes with Adrian's patch
> > >   - fixes for issues Noel found with DWARF unwind
> > >
> > >Noel, could you please rerun your test on this?
> > >
> > >Also available in:
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > >   perf/fixes
> 
> I updated perf/fixes branch with Noel's tag

perf/fixes rebased to latest acme/perf/core

FYI it contains a couple more small fixes on top, which I'll send out later

thanks,
jirka


Re: [PATCH] mmc: host: arasan: sdhci-of-arasan: Remove no-hispd and no-cmd23 quirks for sdhci-arasan4.9a

2016-01-05 Thread Rameshwar Sahu
Hi Ulf,

On Wed, Dec 23, 2015 at 6:59 PM, Rameshswar Prasad Sahu  wrote:
> From: Rameshwar Prasad Sahu 
>
> The Arasan SD host controller supports the set block count command (cmd23)
> and high speed mode. This patch re-enables both of these features, which
> were disabled. A device that doesn't support high speed should
> configure its capability register accordingly instead of disabling it
> explicitly.
>
> Signed-off-by: Rameshwar Prasad Sahu 
> ---
>  drivers/mmc/host/sdhci-of-arasan.c |5 -
>  1 files changed, 0 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/mmc/host/sdhci-of-arasan.c 
> b/drivers/mmc/host/sdhci-of-arasan.c
> index 75379cb..5d9fdb3 100644
> --- a/drivers/mmc/host/sdhci-of-arasan.c
> +++ b/drivers/mmc/host/sdhci-of-arasan.c
> @@ -172,11 +172,6 @@ static int sdhci_arasan_probe(struct platform_device 
> *pdev)
> goto clk_disable_all;
> }
>
> -   if (of_device_is_compatible(pdev->dev.of_node, "arasan,sdhci-4.9a")) {
> -   host->quirks |= SDHCI_QUIRK_NO_HISPD_BIT;
> -   host->quirks2 |= SDHCI_QUIRK2_HOST_NO_CMD23;
> -   }
> -
> sdhci_get_of_property(pdev);
> pltfm_host = sdhci_priv(host);
> pltfm_host->priv = sdhci_arasan;
> --
> 1.7.1
>

Any comments on this patch?


Re: [PATCH v3 1/3] clocksource/vt8500: Increase the minimum delta

2016-01-05 Thread Daniel Lezcano

On 01/05/2016 11:00 AM, Russell King - ARM Linux wrote:

On Tue, Jan 05, 2016 at 12:42:42PM +0300, Roman Volkov wrote:

Why multiply by two? Good question. Maybe there is a reserve for
stability. The value passed by the system to set_next_event() should
not be less than this value, and theoretically, we should not
multiply MIN_OSCR_DELTA by two. As far as I can see, many drivers have
no such minimum values at all.


It's a speciality of the StrongARM/PXA hardware.  It takes a certain
number of OSCR cycles for the value written to hit the compare registers.
So, if a very small delta is written (eg, the compare register is written
with a value of OSCR + 1), the OSCR will have incremented past this value
before it hits the underlying hardware.  The result is, that you end up
waiting a very long time for the OSCR to wrap before the event fires.

So, we introduce a check in set_next_event() to detect this and return
-ETIME if the calculated delta is too small, which causes the generic
clockevents code to retry after adding the min_delta specified in
clockevents_config_and_register() to the current time value.

min_delta must be sufficient that we don't re-trip the -ETIME check - if
we do, we will return -ETIME, forward the next event time, try to set it,
return -ETIME again, and basically lock the system up.  So, min_delta
must be larger than the check inside set_next_event().  A factor of two
was chosen to ensure that this situation would never occur.


Russell,

thank you for taking the time to write this detailed explanation. I 
believe that clarifies everything (the issue with the lockup and the 
value of the min delta).
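To make the retry behaviour concrete, here is a minimal userspace sketch of the mechanism described above, not the actual driver or clockevents code. The value of MIN_OSCR_DELTA and the sketch_* names are assumptions for illustration. The model also ignores the time that passes between computing the delta and writing the compare register, which is the reason the real min_delta needs a margin over the check.

```c
#include <errno.h>
#include <stdint.h>

#define MIN_OSCR_DELTA 16	/* assumed value, for illustration only */

/* Models the check inside set_next_event(): refuse deltas that are too small. */
static int sketch_set_next_event(uint32_t delta)
{
	if (delta < MIN_OSCR_DELTA)
		return -ETIME;
	return 0;
}

/*
 * Models the retry described above: on -ETIME, try again with min_delta.
 * The loop is capped at max_tries so that a too-small min_delta shows up
 * as a failure here rather than as an endless loop.
 */
static int sketch_program_event(uint32_t delta, uint32_t min_delta,
				int max_tries)
{
	int tries;

	for (tries = 0; tries < max_tries; tries++) {
		if (sketch_set_next_event(delta) == 0)
			return 0;
		delta = min_delta;
	}
	return -ETIME;
}
```

With min_delta set to 2 * MIN_OSCR_DELTA the retry succeeds on the second attempt. With a min_delta below MIN_OSCR_DELTA every retry trips the check again, which in the real code would be the lockup described above.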


Roman,

If we are in the situation Russell is describing above, failing 
gracefully as mentioned before does not make sense.


Do you have an idea why this is happening with 4.2 and not before?


The PXA code worked on PXA systems for years, and I'd suggest no one
changes this mechanism without access to a wide range of PXA systems,
otherwise they're risking breakage.


Copy that :)


--
  Linaro.org │ Open source software for ARM SoCs



Re: [PATCH V3 2/2] PM / OPP: Parse 'opp--' bindings

2016-01-05 Thread Geert Uytterhoeven
Hi Viresh,

On Wed, Dec 9, 2015 at 3:31 AM, Viresh Kumar  wrote:
> OPP bindings (for few properties) allow a platform to choose a
> value/range among a set of available options. The options are present as
> opp--, where the platform needs to supply the  string.
>
> The OPP properties which allow such an option are: opp-microvolt and
> opp-microamp.
>
> Add support to the OPP-core to parse these bindings, by introducing
> dev_pm_opp_{set|put}_prop_name() APIs.
>
> Signed-off-by: Viresh Kumar 

> @@ -794,35 +797,48 @@ static int _opp_add_v1(struct device *dev, unsigned 
> long freq, long u_volt,
>  }
>
>  /* TODO: Support multiple regulators */
> -static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev)
> +static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
> + struct device_opp *dev_opp)
>  {
> u32 microvolt[3] = {0};
> u32 val;
> int count, ret;
> +   struct property *prop = NULL;
> +   char name[NAME_MAX];
> +
> +   /* Search for "opp-microvolt-" */
> +   if (dev_opp->prop_name) {
> +   sprintf(name, "opp-microvolt-%s", dev_opp->prop_name);

Any chance an attacker can overflow name[] by providing a very long
dev_opp->prop_name?

Better safe than sorry:

snprintf(name, sizeof(name), ...);
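A minimal sketch of what the bounded version could look like, including a truncation check. The helper name build_prop_name and its return convention are made up for illustration; they are not part of the patch or the OPP core.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not from the patch: build "opp-microvolt-<suffix>"
 * into buf without writing past len bytes.
 * Returns 0 on success, -1 if the name did not fit (or on a format error).
 */
static int build_prop_name(char *buf, size_t len, const char *suffix)
{
	int n = snprintf(buf, len, "opp-microvolt-%s", suffix);

	/* snprintf returns the length it wanted to write, excluding the NUL. */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Checking the return value matters here: a silently truncated name would simply fail to match any property, which is harder to debug than an explicit error.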

> +   prop = of_find_property(opp->np, name, NULL);
> +   }

> @@ -830,7 +846,20 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, 
> struct device *dev)
> opp->u_volt_min = microvolt[1];
> opp->u_volt_max = microvolt[2];
>
> -   if (!of_property_read_u32(opp->np, "opp-microamp", &val))
> +   /* Search for "opp-microamp-" */
> +   prop = NULL;
> +   if (dev_opp->prop_name) {
> +   sprintf(name, "opp-microamp-%s", dev_opp->prop_name);

Likewise

> +   prop = of_find_property(opp->np, name, NULL);
> +   }

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH] BTRFS: Adds an option to select RAID Stripe size

2016-01-05 Thread David Sterba
On Wed, Dec 30, 2015 at 04:21:47PM -0500, Sanidhya Solanki wrote:
> On Wed, 30 Dec 2015 17:17:22 +0100
> David Sterba  wrote:
> 
> > Let me note that a good reputation is also built from patch reviews
> > (hint hint).
> 
> Unfortunately, not too many patches coming in for BTRFS presently.
> Mailing list activity is down to 25-35 mails per day. Mostly feature
> and bug requests.
> 
> I will try to pitch in with patch reviews where possible.

It was not aimed specifically at you, but I won't discourage you from
doing reviews of course. The period in which a review is expected can vary
and is bound to the development cycle of the kernel. At the latest, reviews
should come before the integration branch is put together (before the
merge window), and for the rc's before the next scheduled one (less than
a week).

The reviewed-by tag has a real meaning and weight in the community

http://lxr.free-electrons.com/source/Documentation/SubmittingPatches#L552

and besides that, subscribes the person to the blame game and can cause
bad feelings if the code turns out to be buggy later on.


Re: [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > >> The two mechanisms referenced above would likely require coordination 
> > > >> with
> > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > >> address
> > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > >> preference would be to add a vendor-specific configuration block to the
> > > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > > >> essentially extend shpc to support guest live migration with 
> > > >> pass-through
> > > >> devices.
> > > >
> > > > shpc?
> > > 
> > > That is kind of what I was thinking.  We basically need some mechanism
> > > to allow for the host to ask the device to quiesce.  It has been
> > > proposed to possibly even look at something like an ACPI interface
> > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > case.
> > > 
> > > - Alex
> > 
> > 
> > Start by using hot-unplug for this!
> > 
> > Really use your patch guest side, and write host side
> > to allow starting migration with the device, but
> > defer completing it.
> > 
> > So
> > 
> > 1.- host tells guest to start tracking memory writes
> > 2.- guest acks
> > 3.- migration starts
> > 4.- most memory is migrated
> > 5.- host tells guest to eject device
> > 6.- guest acks
> > 7.- stop vm and migrate rest of state
> > 
> > 
> > It will already be a win since hot unplug after migration starts and
> > most memory has been migrated is better than hot unplug before migration
> > starts.
> > 
> > Then measure downtime and profile. Then we can look at ways
> > to quiesce device faster which really means step 5 is replaced
> > with "host tells guest to quiesce device and dirty (or just unmap!)
> > all memory mapped for write by device".
> 
> 
> Doing a hot-unplug is going to upset the guest's network stack's view
> of the world; that's something we don't want to change.
> 
> Dave

It might, but if you store the IP and restore it quickly
after migration, e.g. using the guest agent as opposed to DHCP,
then it won't.

It allows calming the device down in a generic way;
specific drivers can then implement the fast quiesce.
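The ordering of steps 1-7 quoted above can be written down as a small sketch. The enum and function names are invented for illustration and do not correspond to any QEMU or kernel interface.

```c
/* Steps 1-7 from the quoted list, in order. Names are hypothetical. */
enum mig_step {
	MIG_START_TRACKING,	/* 1: host tells guest to start tracking writes */
	MIG_TRACKING_ACKED,	/* 2: guest acks */
	MIG_STARTED,		/* 3: migration starts */
	MIG_BULK_DONE,		/* 4: most memory is migrated */
	MIG_EJECT_REQUESTED,	/* 5: host tells guest to eject device */
	MIG_EJECT_ACKED,	/* 6: guest acks */
	MIG_STOP_AND_FINISH,	/* 7: stop vm and migrate rest of state */
};

/* Returns 1 if going from cur to next follows the sequence, 0 otherwise. */
static int mig_step_allowed(enum mig_step cur, enum mig_step next)
{
	return (int)next == (int)cur + 1;
}
```

Replacing step 5 with a driver-specific quiesce would keep the same ordering; only the meaning of that one step changes.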

> > 
> > -- 
> > MST
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


Re: [PATCH 05/10] perf tools: Add dynamic sort key for tracepoint events

2016-01-05 Thread Jiri Olsa
On Mon, Dec 21, 2015 at 11:26:48PM +0900, Namhyung Kim wrote:

SNIP

> + free(str);
> + return ret;
> +}
> +
>  static int __sort_dimension__add(struct sort_dimension *sd)
>  {
>   if (sd->taken)
> @@ -1667,6 +1887,9 @@ static int sort_dimension__add(const char *tok,

sort_dimension__add's evlist arg could lose the '__maybe_unused' now

thanks,
jirka

>   return 0;
>   }
>  
> + if (!add_dynamic_entry(evlist, tok))
> + return 0;
> +
>   return -ESRCH;
>  }
>  
> -- 
> 2.6.4
> 


Re: [RFC/PATCH] perf report: Show random usage tip on the help line

2016-01-05 Thread Namhyung Kim
Hi,

On Tue, Jan 05, 2016 at 02:32:47PM +0800, Wangnan (F) wrote:
> 
> 
> On 2016/1/5 13:36, Namhyung Kim wrote:
> >Currently perf report only shows a help message "For a higher level
> >overview, try: perf report --sort comm,dso" unconditionally (even if
> >the sort keys were used).  Add more help tips and show randomly.
> >
> >Signed-off-by: Namhyung Kim 
> >---
> 
> That's really funny.

Thanks for your feedback!

> 
> Some inconvenience:
> 
>  1. The tip never changes during one execution of 'perf report', even if
> I switch to another view using 'enter' and switch back. It would be better
> if the tip were updated when redrawing.

Hmm.. I think it's a preference.  I'd go for simplicity then. :)


> 
>  2. I think adding a "Tip: " prefix to the content would be better; otherwise
> users may be confused about which of their actions caused this message

OK.

> 
>  3. What about creating a tools/perf/Documentation/tips.txt and generating
> the tips table dynamically?

I don't see much difference in doing that.  I guess most users don't
want to go and read the documentation anyway.  Am I missing something?

Btw, does anyone have some tips to add? :)
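For reference, a rough sketch of the idea with the "Tip: " prefix applied. This is not the code from the patch: the table contents beyond the first entry and the name format_tip are placeholders, and the caller is assumed to supply the random pick.

```c
#include <stddef.h>
#include <stdio.h>

/* Only the first entry is the existing help line; the rest are placeholders. */
static const char *const tips[] = {
	"For a higher level overview, try: perf report --sort comm,dso",
	"Placeholder tip two (made up for this sketch)",
	"Placeholder tip three (made up for this sketch)",
};

/*
 * Format one tip with a "Tip: " prefix into buf.
 * pick is any number (e.g. the result of rand()); it is reduced modulo
 * the table size. Returns what snprintf() returns.
 */
static int format_tip(char *buf, size_t len, unsigned int pick)
{
	size_t n = sizeof(tips) / sizeof(tips[0]);

	return snprintf(buf, len, "Tip: %s", tips[pick % n]);
}
```

A caller would seed once with srand(time(NULL)) and pass rand() as pick, so the tip is chosen once per run.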

Thanks,
Namhyung


Re: [RFC PATCH 0/3] x86: Add support for guest DMA dirty page tracking

2016-01-05 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Jan 05, 2016 at 10:01:04AM +, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > On Mon, Jan 04, 2016 at 07:11:25PM -0800, Alexander Duyck wrote:
> > > > >> The two mechanisms referenced above would likely require 
> > > > >> coordination with
> > > > >> QEMU and as such are open to discussion.  I haven't attempted to 
> > > > >> address
> > > > >> them as I am not sure there is a consensus as of yet.  My personal
> > > > >> preference would be to add a vendor-specific configuration block to 
> > > > >> the
> > > > >> emulated pci-bridge interfaces created by QEMU that would allow us to
> > > > >> essentially extend shpc to support guest live migration with 
> > > > >> pass-through
> > > > >> devices.
> > > > >
> > > > > shpc?
> > > > 
> > > > That is kind of what I was thinking.  We basically need some mechanism
> > > > to allow for the host to ask the device to quiesce.  It has been
> > > > proposed to possibly even look at something like an ACPI interface
> > > > since I know ACPI is used by QEMU to manage hot-plug in the standard
> > > > case.
> > > > 
> > > > - Alex
> > > 
> > > 
> > > Start by using hot-unplug for this!
> > > 
> > > Really use your patch guest side, and write host side
> > > to allow starting migration with the device, but
> > > defer completing it.
> > > 
> > > So
> > > 
> > > 1.- host tells guest to start tracking memory writes
> > > 2.- guest acks
> > > 3.- migration starts
> > > 4.- most memory is migrated
> > > 5.- host tells guest to eject device
> > > 6.- guest acks
> > > 7.- stop vm and migrate rest of state
> > > 
> > > 
> > > It will already be a win since hot unplug after migration starts and
> > > most memory has been migrated is better than hot unplug before migration
> > > starts.
> > > 
> > > Then measure downtime and profile. Then we can look at ways
> > > to quiesce device faster which really means step 5 is replaced
> > > with "host tells guest to quiesce device and dirty (or just unmap!)
> > > all memory mapped for write by device".
> > 
> > 
> > > Doing a hot-unplug is going to upset the guest's network stack's view
> > > of the world; that's something we don't want to change.
> > 
> > Dave
> 
> It might, but if you store the IP and restore it quickly
> after migration, e.g. using the guest agent as opposed to DHCP,
> then it won't.

I thought if you hot-unplug then it will lose any outstanding connections
on that device.

> It allows calming the device down in a generic way;
> specific drivers can then implement the fast quiesce.

Except that if it breaks the guest networking it's useless.

Dave

> 
> > > 
> > > -- 
> > > MST
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

