date:20140722

RE: [PATCH 2/6] ARM: imx: clk-vf610: add USBPHY clocks

2014-07-22 Thread Jingchang Lu


>-----Original Message-----
>From: Shawn Guo [mailto:shawn@freescale.com]
>Sent: Tuesday, July 22, 2014 10:32 AM
>To: Stefan Agner; Lu Jingchang-B35083
>Cc: Chen Peter-B29397; s.ha...@pengutronix.de; linux-arm-
>ker...@lists.infradead.org; linux-...@vger.kernel.org; linux-
>ker...@vger.kernel.org
>Subject: Re: [PATCH 2/6] ARM: imx: clk-vf610: add USBPHY clocks
>
>On Fri, Jul 18, 2014 at 07:01:38PM +0200, Stefan Agner wrote:
>> This commit adds PLL7 which is required for USBPHY1. It also adds the
>> USB PHY and USB Controller clocks and the gates to enable them.
>>
>> Signed-off-by: Stefan Agner 
>
>Jingchang,
>
>Does the patch look good to you?
>
>Shawn
>
For the clk creation, I think it is ok if the functionality has been tested, 
thanks!

Acked-by: Jingchang Lu

[PATCH] Staging: gdm724x: gdm_lte.c: Fix warning of prefer ether_addr_copy()

2014-07-22 Thread Kiran Padwal

This patch fixes the following checkpatch.pl warning:
WARNING: "Prefer ether_addr_copy() over memcpy() if the Ethernet addresses are __aligned(2)".

Signed-off-by: Kiran Padwal 
---
 drivers/staging/gdm724x/gdm_lte.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/gdm724x/gdm_lte.c b/drivers/staging/gdm724x/gdm_lte.c
index bc6d574..6df6c70 100644
--- a/drivers/staging/gdm724x/gdm_lte.c
+++ b/drivers/staging/gdm724x/gdm_lte.c
@@ -626,7 +626,7 @@ static void gdm_lte_netif_rx(struct net_device *dev, char *buf,
void *addr = buf + sizeof(struct iphdr) +
sizeof(struct udphdr) +
offsetof(struct dhcp_packet, chaddr);
-   memcpy(nic->dest_mac_addr, addr, ETH_ALEN);
+   ether_addr_copy(nic->dest_mac_addr, addr);
}
}
 
@@ -639,7 +639,7 @@ static void gdm_lte_netif_rx(struct net_device *dev, char *buf,
}
 
/* Format the data so that it can be put to skb */
-   memcpy(mac_header_data, nic->dest_mac_addr, ETH_ALEN);
+   ether_addr_copy(mac_header_data, nic->dest_mac_addr);
memcpy(mac_header_data + ETH_ALEN, nic->src_mac_addr, ETH_ALEN);
 
vlan_eth.h_vlan_TCI = htons(nic->vlan_id);
@@ -842,9 +842,9 @@ static void form_mac_address(u8 *dev_addr, u8 *nic_src, u8 *nic_dest,
 {
/* Form the dev_addr */
if (!mac_address)
-   memcpy(dev_addr, gdm_lte_macaddr, ETH_ALEN);
+   ether_addr_copy(dev_addr, gdm_lte_macaddr);
else
-   memcpy(dev_addr, mac_address, ETH_ALEN);
+   ether_addr_copy(dev_addr, mac_address);
 
/* The last byte of the mac address
 * should be less than or equal to 0xFC
@@ -858,7 +858,7 @@ static void form_mac_address(u8 *dev_addr, u8 *nic_src, u8 *nic_dest,
memcpy(nic_src, dev_addr, 3);
 
/* Copy the nic_dest from dev_addr*/
-   memcpy(nic_dest, dev_addr, ETH_ALEN);
+   ether_addr_copy(nic_dest, dev_addr);
 }
 
 static void validate_mac_address(u8 *mac_address)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] gpiolib: Export gpiochip_request_own_desc and gpiochip_free_own_desc

2014-07-22 Thread Mika Westerberg

On Mon, Jul 21, 2014 at 11:53:27PM -0700, Guenter Roeck wrote:
> On 07/21/2014 11:43 PM, Mika Westerberg wrote:
> >On Mon, Jul 21, 2014 at 10:18:25PM -0700, Guenter Roeck wrote:
> >>Both functions were introduced to let gpio drivers request their own
> >>gpio pins. Without exporting the functions, this can however only be
> >>used by gpio drivers built into the kernel.
> >
> >The reason why these are private to drivers is that those are dangerous
> >if used blindly.
> >
> >>Secondary impact is that the functions can not currently be used by
> >>platform initialization code associated with the gpio-pca953x driver.
> >>This code permits auto-export of gpio pins through platform data, but
> >>if this functionality is used, the module can no longer be unloaded due
> >>to the problem solved with the introduction of gpiochip_request_own_desc
> >>and gpiochip_free_own_desc.
> >>
> >>Export both function so they can be used from modules and from
> >>platform initialization code.
> >
> >However, you have valid reason above. I wonder if this requires
> >some documentation in Documentation/gpio/driver.txt?
> >
> Sure. Any idea what I should write, or do you want me to come up with 
> something ?

Come up with something that lists valid usage of these two functions :)

Mainly that they are supposed to be used by GPIO chip drivers to request
their own GPIOs, not consumer drivers etc. And even in GPIO chip drivers
they should be used sparingly.

> 
> >>Cc: Mika Westerberg 
> >>Signed-off-by: Guenter Roeck 
> >>---
> >>v2: Move function declarations from consumer.h to driver.h.
> >>
> >>  drivers/gpio/gpiolib.c  | 2 ++
> >>  drivers/gpio/gpiolib.h  | 3 ---
> >>  include/linux/gpio/driver.h | 3 +++
> >>  3 files changed, 5 insertions(+), 3 deletions(-)
> >>
> >>diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> >>index 43d9e34..04c647e 100644
> >>--- a/drivers/gpio/gpiolib.c
> >>+++ b/drivers/gpio/gpiolib.c
> >>@@ -1953,6 +1953,7 @@ int gpiochip_request_own_desc(struct gpio_desc *desc, const char *label)
> >>
> >>return __gpiod_request(desc, label);
> >>  }
> >>+EXPORT_SYMBOL(gpiochip_request_own_desc);
> >
> >EXPORT_SYMBOL_GPL?
> >
> Ok.
> 
> Thanks,
> Guenter

Re: [PATCH] gpio: Add support for GPIOF_ACTIVE_LOW to gpio_request_one

2014-07-22 Thread Alexandre Courbot

On Tue, Jul 22, 2014 at 2:16 PM, Guenter Roeck  wrote:
> On 07/21/2014 09:30 PM, Alexandre Courbot wrote:
>>
>> On Thu, Jul 17, 2014 at 11:34 PM, Guenter Roeck 
>> wrote:
>>>
>>> On 07/17/2014 12:26 AM, Alexandre Courbot wrote:


>>>> On Thu, Jul 17, 2014 at 3:37 PM, Guenter Roeck
>>>> wrote:
>>>>>
>>>>>
>>>>> On 07/16/2014 11:09 PM, Alexandre Courbot wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 17, 2014 at 8:11 AM, Guenter Roeck
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The gpio include file and the gpio documentation declare and document
>>>>>>> GPIOF_ACTIVE_LOW as one of the flags to be passed to gpio_request_one
>>>>>>> and related functions. However, the flag is not evaluated or used.
>>>>>>>
>>>>>>> Check the flag in gpio_request_one and set the gpio internal flag
>>>>>>> FLAG_ACTIVE_LOW if it is set.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> What is the point since the integer GPIO API has no clue of the
>>>>>> active-low status of a GPIO? It is only used by the gpiod and sysfs
>>>>>> interfaces.
>>>>>>
>>>>>
>>>>> One can use gpio_request_one() to export a gpio pin to user space from
>>>>> the kernel. That code path does use the flag, as you point out yourself
>>>>> above.
>>>>
>>>>
>>>>
>>>> Ok, in that case I suppose it makes sense.
>>>>
>>>> Reviewed-by: Alexandre Courbot
>>>>
>>>>> One could also argue that the integer gpio API _should_ support this as
>>>>> well,
>>>>> but that is a different question.
>>>>
>>>>
>>>>
>>>> Probably not going to happen. The integer GPIO interface is deprecated
>>>> and users who need new features should seriously consider switching to
>>>> gpiod.

>>>
>>> The new API is unfortunately not equivalent to the old one. For example,
>>> if I understand correctly, gpiod_get is expected to be used instead of
>>> gpio_request_one.
>>
>>
>> That is correct. The reason is to have a separate authority to assigns
>> GPIOs and prevent drivers from arbitrarily requesting any GPIO they
>> want, as long as everybody sticks to the gpiod interface.
>>
>>> That may work nicely in a world with full DT or ACPI
>>> support, but doesn't work as well otherwise unless one drops the notion
>>> of using platform specific drivers built as modules
>>> (gpiod_add_lookup_table
>>> is not exported, and there is no remove function).
>>>
>>> Specifically, I don't see an easy way to convert mdio-gpio to use the new
>>> model, and that driver could really use support for an API which supports
>>> active-low pins.
>>
>>
>> If you want to benefit from the active-low property but cannot use
>> gpiod_get() for some reason, you can still request a GPIO by
>> gpio_request_one() and then convert it to a descriptor with
>> gpio_to_desc(), which is what I suspect your patch will allow you to
>> do. But the real fix would be to work any limitations that gpiod has
>> that prevent you from doing what you need.
>>
>> I am not familiar with mdio-gpio - could you explain what makes it
>> impossible to convert this driver to the new model in your view?
>>
>
> gpiod_get has the notion of 'con_id'. That is all fine if you have
> a system with acpi or devicetree to set this up, but the platform
> data fallback is pretty clumsy (at least in my opinion).

GPIO lookup tables for platform data are still young, and updating them
to something better is still possible if you have ideas on how to
improve them. Actually I am thinking about giving them a second thought
myself.

> To convert
> mdio-gpio such that it can provide the required platform data to gpio
> would complicate the code so much that the conversion would not be worth
> it.

Here "the code" would only consist of platform-dependent lookup
tables, right? Or do you need to build GPIO lookup tables dynamically?

>
> It would also require to convert all drivers _using_ mdio-gpio to the
> new platform data format required by gpiod_get, or the mdio-gpio driver
> would have to do some kind of internal conversion, both of which would
> make the code even more complicated (and less likely to get it accepted).

Not necessarily. The gpiod interface has been designed so that
most existing DT bindings can use it without having to change. For
instance, in Documentation/devicetree/bindings/net/mdio-gpio.txt I see
that GPIOs are assigned this way:

gpios = <&qe_pio_a 11
&qe_pio_c 6>;

You can get such GPIOs by omitting the con_id argument in
gpiod_get_index(), e.g. gpiod_get_index(dev, NULL, 1) will return the
second GPIO of the array.

Likewise, you can define lookup tables containing only GPIOs
without a con_id. They will match the same call.

>
> My current solution is exactly what you suggested - to use
> gpio_request_one()
> followed by gpio_to_desc(). I plan to submit the necessary patches
> once this patch has been accepted.

Not ideal, but acceptable as a temporary solution until we find a
better way for you to use gpiod. It's better to use it halfway than
not at all.
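
[Editorial sketch] The con_id-less lookup described in this message can be
modeled in plain userspace C. This is a simplified model, not the gpiolib
implementation: struct lookup_entry and lookup_find() are hypothetical names,
and the matching rule (the index must match, and an entry that carries a
con_id only matches that exact con_id) is an assumption drawn from the
description above. The table reuses the two GPIOs from the mdio-gpio binding
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one GPIO lookup table entry. */
struct lookup_entry {
	const char *chip_label;	/* NULL terminates the table */
	unsigned int hwnum;	/* line number on that chip */
	const char *con_id;	/* NULL: entry carries no function name */
	unsigned int idx;	/* position within the consumer's GPIO list */
};

/* The two GPIOs from the binding example, both without a con_id. */
static const struct lookup_entry mdio_table[] = {
	{ "qe_pio_a", 11, NULL, 0 },
	{ "qe_pio_c", 6, NULL, 1 },
	{ NULL, 0, NULL, 0 },
};

/* Return the first entry matching (con_id, idx), or NULL if none does. */
static const struct lookup_entry *
lookup_find(const struct lookup_entry *table, const char *con_id,
	    unsigned int idx)
{
	const struct lookup_entry *p;

	for (p = table; p->chip_label; p++) {
		/* The index must always match. */
		if (p->idx != idx)
			continue;
		/* An entry that names a con_id only matches that con_id. */
		if (p->con_id && (!con_id || strcmp(p->con_id, con_id) != 0))
			continue;
		return p;
	}
	return NULL;
}
```

In this model, lookup_find(mdio_table, NULL, 1) yields the qe_pio_c entry,
which mirrors what gpiod_get_index(dev, NULL, 1) is described as returning.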

Re: [PATCH] powerpc: Revert removing of _INIT_GLOBAL(), _STATIC() and _INIT_STATIC()

2014-07-22 Thread Michael Ellerman

On Tue, 2014-07-22 at 16:13 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2014-07-21 at 14:56 -0400, Steven Rostedt wrote:
> > > Weird ... what are your gcc and binutils versions ? Smells like a
> > > toolchain issue to me but I need to dig a bit more. Doesn't hit any
> > > of my test configs here.
> > > 
> > 
> > Can you test the attached config with this toolchain and see if you get
> > the same build bug.
> 
> Yeah ok, I hit it too, it's one of those issues with the bloody
> alternate feature sections getting pushed too far away. Argh.

This should fix it temporarily at least:

  http://patchwork.ozlabs.org/patch/369950/

cheers



Re: [PATCH] checkpatch: Fix false positive MISSING_BREAK warnings with --file

2014-07-22 Thread Lee Jones

On Mon, 21 Jul 2014, Joe Perches wrote:
> Using --file mode can give false positives with MISSING_BREAK
> fall-through warnings on simple but long multiple consecutive
> case statements.
> 
> Look for all lines before a case statement for a switch
> or a statement when using --file mode.
> 
> Fix a misspelling of preceded while there.
> 
> Signed-off-by: Joe Perches 
> Reported-by: Lee Jones 

Solution works for me.

> total: 0 errors, 0 warnings, 574 lines checked
> 
> drivers/mfd/tps80031.c has no obvious style problems and is ready
> for submission.

Acked-by: Lee Jones 

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

[PATCH] Staging: gdm724x: gdm_usb.c: fix missing blank line after variable declaration

2014-07-22 Thread Kiran Padwal

Checkpatch fix - Add missing blank line after variable declaration

Signed-off-by: Kiran Padwal 
---
 drivers/staging/gdm724x/gdm_usb.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/gdm724x/gdm_usb.c b/drivers/staging/gdm724x/gdm_usb.c
index ea89d53..0c1b2de 100644
--- a/drivers/staging/gdm724x/gdm_usb.c
+++ b/drivers/staging/gdm724x/gdm_usb.c
@@ -896,6 +896,7 @@ static void gdm_usb_disconnect(struct usb_interface *intf)
 	struct lte_udev *udev;
 	u16 idVendor, idProduct;
 	struct usb_device *usbdev;
+
 	usbdev = interface_to_usbdev(intf);
 
 	idVendor = __le16_to_cpu(usbdev->descriptor.idVendor);
-- 
1.7.9.5


[PATCH] sched: fix llc shared map unreleased during cpu hotplug

2014-07-22 Thread Wanpeng Li

[  220.262093] BUG: unable to handle kernel NULL pointer dereference at 0004
[  220.262104] IP: [] find_busiest_group+0x2b9/0xa30
[  220.262111] PGD 5a9d5067 PUD 13067 PMD 0
[  220.262117] Oops:  [#3] SMP
[...]
[  220.262245] Call Trace:
[  220.262252]  [] load_balance+0x156/0x980
[  220.262259]  [] ? _raw_spin_unlock_irqrestore+0x2e/0xa0
[  220.262266]  [] idle_balance+0xe3/0x150
[  220.262270]  [] __schedule+0x797/0x8d0
[  220.262277]  [] schedule+0x24/0x70
[  220.262283]  [] schedule_timeout+0x119/0x1f0
[  220.262294]  [] ? lock_timer_base+0x70/0x70
[  220.262301]  [] schedule_timeout_uninterruptible+0x19/0x20
[  220.262308]  [] msleep+0x18/0x20
[  220.262317]  [] lock_device_hotplug_sysfs+0x2a/0x50
[  220.262323]  [] online_store+0x2e/0x80
[  220.262358]  [] dev_attr_store+0x1b/0x20
[  220.262366]  [] sysfs_write_file+0xdd/0x160
[  220.262377]  [] vfs_write+0xc8/0x170
[  220.262384]  [] SyS_write+0x5a/0xa0
[  220.262388]  [] system_call_fastpath+0x16/0x1b

The last-level cache (LLC) shared map is built during CPU up, and the
sched domain build routine uses it to set up the sched domain CPU topology.
However, the LLC shared map is not released during CPU disable, which leads
to an invalid sched domain CPU topology. This patch fixes it by releasing
the LLC shared map correctly during CPU disable.

Signed-off-by: Wanpeng Li 
---
 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5492798..0134ec7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1292,6 +1292,9 @@ static void remove_siblinginfo(int cpu)
 
 	for_each_cpu(sibling, cpu_sibling_mask(cpu))
 		cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
+	for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
+		cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
+	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(cpu_sibling_mask(cpu));
 	cpumask_clear(cpu_core_mask(cpu));
 	c->phys_proc_id = 0;
-- 
1.9.1
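
[Editorial sketch] The cleanup added by this patch can be illustrated with a
small userspace model that uses one 32-bit word per CPU in place of the
kernel's cpumask type. llc_remove_cpu() is a hypothetical name; the sketch
only mirrors the two steps in the hunk above (clear the departing CPU from
every sibling's LLC mask, then clear its own mask) and assumes fewer than 32
CPUs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * masks[i] has bit j set when CPU i and CPU j share a last-level cache.
 * Remove `cpu` from the shared map of every CPU it shares a cache with,
 * then empty its own map, so that no stale bits survive the hot-unplug.
 */
static void llc_remove_cpu(uint32_t *masks, int nr_cpus, int cpu)
{
	int sibling;

	for (sibling = 0; sibling < nr_cpus; sibling++) {
		if (masks[cpu] & (UINT32_C(1) << sibling))
			masks[sibling] &= ~(UINT32_C(1) << cpu);
	}
	masks[cpu] = 0;
}
```

With four CPUs that all share one cache (every mask 0xF), removing CPU 2
leaves 0xB in the remaining masks and 0 in the mask of CPU 2. Without the
first step, the other CPUs would still list CPU 2 as an LLC sibling, which is
the stale state the commit message describes.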


[PATCH] mmc: core: Remove fixed voltage regulator logic

2014-07-22 Thread Tim Kryger

There is no need for regulator consumers to include special logic for
fixed voltage regulators as they support regulator_set_voltage() just
like their non-fixed regulator counterparts.

Signed-off-by: Tim Kryger 
---

Since this eliminates logic that was concealing a bug in how the SDHCI
driver was setting ocr_avail, it is important that the following patch
to fix that bug be taken first.  Fortunately, it is already queued.

  https://lkml.org/lkml/2014/6/13/451

 drivers/mmc/core/core.c |8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 7dc0c85..e56375c 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1280,15 +1280,7 @@ int mmc_regulator_set_ocr(struct mmc_host *mmc,
max_uV = min_uV + 100 * 1000;
}
 
-   /*
-* If we're using a fixed/static regulator, don't call
-* regulator_set_voltage; it would fail.
-*/
voltage = regulator_get_voltage(supply);
-
-   if (!regulator_can_change_voltage(supply))
-   min_uV = max_uV = voltage;
-
if (voltage < 0)
result = voltage;
else if (voltage < min_uV || voltage > max_uV)
-- 
1.7.9.5
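
[Editorial sketch] The control flow that remains after this patch can be
modeled in userspace C. Everything here is a hypothetical stand-in:
struct reg_model, reg_set_voltage() and set_ocr_model() are not kernel APIs,
and the rule that a set-voltage request succeeds only when the requested
window overlaps the supported range is an assumption made for illustration.
A fixed regulator is modeled as one whose supported range is a single value.

```c
#include <assert.h>

/* Hypothetical regulator model; a fixed regulator has min_uV == max_uV. */
struct reg_model {
	int cur_uV;	/* present output voltage */
	int min_uV;	/* lowest supported voltage */
	int max_uV;	/* highest supported voltage */
	int set_calls;	/* how often a voltage change was requested */
};

/* Assumed behavior: succeed only if the window overlaps the range. */
static int reg_set_voltage(struct reg_model *r, int min_uV, int max_uV)
{
	r->set_calls++;
	if (max_uV < r->min_uV || min_uV > r->max_uV)
		return -1;
	r->cur_uV = min_uV > r->min_uV ? min_uV : r->min_uV;
	return 0;
}

/* Same decision for fixed and adjustable regulators. */
static int set_ocr_model(struct reg_model *r, int min_uV, int max_uV)
{
	int voltage = r->cur_uV;

	if (voltage < 0)
		return voltage;
	if (voltage < min_uV || voltage > max_uV)
		return reg_set_voltage(r, min_uV, max_uV);
	return 0;	/* already inside the requested window */
}
```

In this model a fixed 3.3 V supply satisfies a 3.2 V to 3.4 V request without
any set-voltage call, while a 1.7 V to 1.95 V request reaches
reg_set_voltage() and fails visibly instead of being masked.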


[PATCH 2/5] gpio: simplify gpiochip_export()

2014-07-22 Thread Alexandre Courbot

For some reason gpiochip_export() would invalidate all the descriptors
of a chip if exporting it to sysfs failed. This does not appear to be
necessary. Remove that part of the code.

While we are at it, add a note about the unsafety of temporarily
releasing a spinlock in the middle of the loop whose iterator it
protects, and explain why this is done.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpio/gpiolib-sysfs.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index 3516502059f2..f150aa288fa1 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -760,18 +760,8 @@ int gpiochip_export(struct gpio_chip *chip)
 	chip->exported = (status == 0);
 	mutex_unlock(&sysfs_lock);
 
-	if (status) {
-		unsigned long	flags;
-		unsigned	gpio;
-
-		spin_lock_irqsave(&gpio_lock, flags);
-		gpio = 0;
-		while (gpio < chip->ngpio)
-			chip->desc[gpio++].chip = NULL;
-		spin_unlock_irqrestore(&gpio_lock, flags);
-
+	if (status)
 		chip_dbg(chip, "%s: status %d\n", __func__, status);
-	}
 
 	return status;
 }
@@ -817,6 +807,14 @@ static int __init gpiolib_sysfs_init(void)
 		if (!chip || chip->exported)
 			continue;
 
+		/*
+		 * TODO we yield gpio_lock here because gpiochip_export()
+		 * acquires a mutex. This is unsafe and needs to be fixed.
+		 *
+		 * Also it would be nice to use gpiochip_find() here so we
+		 * can keep gpio_chips local to gpiolib.c, but the yield of
+		 * gpio_lock prevents us from doing this.
+		 */
 		spin_unlock_irqrestore(&gpio_lock, flags);
 		status = gpiochip_export(chip);
 		spin_lock_irqsave(&gpio_lock, flags);
-- 
2.0.1


[PATCH 3/5] gpio: make gpiochip_get_desc() gpiolib-private

2014-07-22 Thread Alexandre Courbot

As GPIO descriptors are not going to remain unique anymore, having this
function public is not safe. Restrict its use to gpiolib since we have
no user outside of it.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpio/gpiolib-of.c   | 2 +-
 drivers/gpio/gpiolib.c  | 1 -
 drivers/gpio/gpiolib.h  | 2 ++
 include/linux/gpio/driver.h | 3 ---
 4 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
index 3e2fae205bee..7cfdc2278905 100644
--- a/drivers/gpio/gpiolib-of.c
+++ b/drivers/gpio/gpiolib-of.c
@@ -23,7 +23,7 @@
 #include 
 #include 
 
-struct gpio_desc;
+#include "gpiolib.h"
 
 /* Private data structure for of_gpiochip_find_and_xlate */
 struct gg_data {
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index c5509359ba88..38d176e31379 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -82,7 +82,6 @@ struct gpio_desc *gpiochip_get_desc(struct gpio_chip *chip,
 
return &chip->desc[hwnum];
 }
-EXPORT_SYMBOL_GPL(gpiochip_get_desc);
 
 /**
  * Convert a GPIO descriptor to the integer namespace.
diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
index 98020c393eb3..acbb9335f08c 100644
--- a/drivers/gpio/gpiolib.h
+++ b/drivers/gpio/gpiolib.h
@@ -51,6 +51,8 @@ void gpiochip_free_own_desc(struct gpio_desc *desc);
 struct gpio_desc *of_get_named_gpiod_flags(struct device_node *np,
   const char *list_name, int index, enum of_gpio_flags *flags);
 
+struct gpio_desc *gpiochip_get_desc(struct gpio_chip *chip, u16 hwnum);
+
 extern struct spinlock gpio_lock;
 extern struct list_head gpio_chips;
 
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
index 573e4f3243d0..4dc79714c136 100644
--- a/include/linux/gpio/driver.h
+++ b/include/linux/gpio/driver.h
@@ -151,9 +151,6 @@ void gpiod_unlock_as_irq(struct gpio_desc *desc);
 
 struct gpio_chip *gpiod_to_chip(const struct gpio_desc *desc);
 
-struct gpio_desc *gpiochip_get_desc(struct gpio_chip *chip,
-   u16 hwnum);
-
 enum gpio_lookup_flags {
GPIO_ACTIVE_HIGH = (0 << 0),
GPIO_ACTIVE_LOW = (1 << 0),
-- 
2.0.1


[PATCH 4/5] gpio: remove gpiod_lock/unlock_as_irq()

2014-07-22 Thread Alexandre Courbot

gpio_lock/unlock_as_irq() are working with (chip, offset) arguments and
are thus not using the old integer namespace. Therefore, there is no
reason to have gpiod variants of these functions working with
descriptors, especially since the (chip, offset) tuple is more suitable
to the users of these functions (GPIO drivers, whereas GPIO descriptors
are targeted at GPIO consumers).

Signed-off-by: Alexandre Courbot 
---
 Documentation/gpio/driver.txt |  4 ++--
 drivers/gpio/gpiolib-acpi.c   |  6 +++---
 drivers/gpio/gpiolib-legacy.c | 12 
 drivers/gpio/gpiolib-sysfs.c  |  4 ++--
 drivers/gpio/gpiolib.c| 30 --
 include/asm-generic/gpio.h|  3 ---
 include/linux/gpio/driver.h   |  4 ++--
 7 files changed, 25 insertions(+), 38 deletions(-)

diff --git a/Documentation/gpio/driver.txt b/Documentation/gpio/driver.txt
index fa9a0a8b3734..224dbbcd1804 100644
--- a/Documentation/gpio/driver.txt
+++ b/Documentation/gpio/driver.txt
@@ -157,12 +157,12 @@ Locking IRQ usage
 Input GPIOs can be used as IRQ signals. When this happens, a driver is requested
 to mark the GPIO as being used as an IRQ:
 
-   int gpiod_lock_as_irq(struct gpio_desc *desc)
+   int gpio_lock_as_irq(struct gpio_chip *chip, unsigned int offset)
 
 This will prevent the use of non-irq related GPIO APIs until the GPIO IRQ lock
 is released:
 
-   void gpiod_unlock_as_irq(struct gpio_desc *desc)
+   void gpio_unlock_as_irq(struct gpio_chip *chip, unsigned int offset)
 
 When implementing an irqchip inside a GPIO driver, these two functions should
 typically be called in the .startup() and .shutdown() callbacks from the
diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c
index 4a987917c186..d2e8600df02c 100644
--- a/drivers/gpio/gpiolib-acpi.c
+++ b/drivers/gpio/gpiolib-acpi.c
@@ -157,7 +157,7 @@ static acpi_status acpi_gpiochip_request_interrupt(struct acpi_resource *ares,
 
gpiod_direction_input(desc);
 
-   ret = gpiod_lock_as_irq(desc);
+   ret = gpio_lock_as_irq(chip, pin);
if (ret) {
dev_err(chip->dev, "Failed to lock GPIO as interrupt\n");
goto fail_free_desc;
@@ -212,7 +212,7 @@ static acpi_status acpi_gpiochip_request_interrupt(struct acpi_resource *ares,
 fail_free_event:
kfree(event);
 fail_unlock_irq:
-   gpiod_unlock_as_irq(desc);
+   gpio_unlock_as_irq(chip, pin);
 fail_free_desc:
gpiochip_free_own_desc(desc);
 
@@ -263,7 +263,7 @@ static void acpi_gpiochip_free_interrupts(struct acpi_gpio_chip *acpi_gpio)
desc = gpiochip_get_desc(chip, event->pin);
if (WARN_ON(IS_ERR(desc)))
continue;
-   gpiod_unlock_as_irq(desc);
+   gpio_unlock_as_irq(chip, event->pin);
gpiochip_free_own_desc(desc);
list_del(&event->node);
kfree(event);
diff --git a/drivers/gpio/gpiolib-legacy.c b/drivers/gpio/gpiolib-legacy.c
index eb5a4e2cee85..c3e3823a40b9 100644
--- a/drivers/gpio/gpiolib-legacy.c
+++ b/drivers/gpio/gpiolib-legacy.c
@@ -97,15 +97,3 @@ void gpio_free_array(const struct gpio *array, size_t num)
gpio_free((array++)->gpio);
 }
 EXPORT_SYMBOL_GPL(gpio_free_array);
-
-int gpio_lock_as_irq(struct gpio_chip *chip, unsigned int offset)
-{
-   return gpiod_lock_as_irq(gpiochip_get_desc(chip, offset));
-}
-EXPORT_SYMBOL_GPL(gpio_lock_as_irq);
-
-void gpio_unlock_as_irq(struct gpio_chip *chip, unsigned int offset)
-{
-   return gpiod_unlock_as_irq(gpiochip_get_desc(chip, offset));
-}
-EXPORT_SYMBOL_GPL(gpio_unlock_as_irq);
diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index f150aa288fa1..be45a9283c28 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -161,7 +161,7 @@ static int gpio_setup_irq(struct gpio_desc *desc, struct device *dev,
desc->flags &= ~GPIO_TRIGGER_MASK;
 
if (!gpio_flags) {
-   gpiod_unlock_as_irq(desc);
+   gpio_unlock_as_irq(desc->chip, gpio_chip_hwgpio(desc));
ret = 0;
goto free_id;
}
@@ -200,7 +200,7 @@ static int gpio_setup_irq(struct gpio_desc *desc, struct device *dev,
if (ret < 0)
goto free_id;
 
-   ret = gpiod_lock_as_irq(desc);
+   ret = gpio_lock_as_irq(desc->chip, gpio_chip_hwgpio(desc));
if (ret < 0) {
gpiod_warn(desc, "failed to flag the GPIO for IRQ\n");
goto free_id;
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 38d176e31379..7582207c92e7 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1428,44 +1428,46 @@ int gpiod_to_irq(const struct gpio_desc *desc)
 EXPORT_SYMBOL_GPL(gpiod_to_irq);
 
 /**
- * gpiod_lock_as_irq() - lock a GPIO to be used as IRQ
- * @gpio: the GPIO line to lock as used for IRQ
+ * gpio_lock_as_irq() - lock a GPIO to be used as IRQ
+ * @chip: the chip
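
[Editorial sketch] The (chip, offset) calling convention and the locking rule
from the driver.txt text in this patch can be modeled in userspace C. This is
not the gpiolib code: struct chip_model and the model_*() functions are
hypothetical, and the specific checks (an output line cannot be locked as an
IRQ, and a line locked as an IRQ cannot be switched to output) are assumptions
that illustrate "prevent the use of non-irq related GPIO APIs".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NGPIO 8

/* Hypothetical per-chip state, indexed by hardware offset. */
struct chip_model {
	bool is_out[MODEL_NGPIO];
	bool used_as_irq[MODEL_NGPIO];
};

/* Mark a line as an IRQ source; refused for lines driven as outputs. */
static int model_lock_as_irq(struct chip_model *chip, unsigned int offset)
{
	if (offset >= MODEL_NGPIO)
		return -EINVAL;
	if (chip->is_out[offset])
		return -EIO;
	chip->used_as_irq[offset] = true;
	return 0;
}

/* Release the IRQ mark so the line can be reconfigured again. */
static void model_unlock_as_irq(struct chip_model *chip, unsigned int offset)
{
	if (offset < MODEL_NGPIO)
		chip->used_as_irq[offset] = false;
}

/* Switching to output is refused while the line is locked as an IRQ. */
static int model_direction_output(struct chip_model *chip, unsigned int offset)
{
	if (offset >= MODEL_NGPIO)
		return -EINVAL;
	if (chip->used_as_irq[offset])
		return -EIO;
	chip->is_out[offset] = true;
	return 0;
}
```

The point of the model is that a GPIO driver already holds the chip and the
hardware offset, so it can make both calls without looking up a descriptor
first.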

[PATCH 5/5] gpio: move gpio_ensure_requested() into legacy C file

2014-07-22 Thread Alexandre Courbot

gpio_ensure_requested() only makes sense when using the integer-based
GPIO API, so make sure it is called from there instead of the gpiod
API which we know cannot be called with a non-requested GPIO anyway.

The uses of gpio_ensure_requested() in the gpiod API were kind of
out-of-place anyway, so putting them in gpiolib-legacy.c helps clean up
the code.

Actually, considering how long this ensure_requested mechanism has been
around, maybe we should just turn this patch into "remove
gpio_ensure_requested()" if we know for sure that no user depends on it
anymore?

Signed-off-by: Alexandre Courbot 
---
 drivers/gpio/gpiolib-legacy.c | 106 ++
 drivers/gpio/gpiolib.c| 129 ++
 include/asm-generic/gpio.h|  15 +
 3 files changed, 113 insertions(+), 137 deletions(-)

diff --git a/drivers/gpio/gpiolib-legacy.c b/drivers/gpio/gpiolib-legacy.c
index c3e3823a40b9..9da5deee716c 100644
--- a/drivers/gpio/gpiolib-legacy.c
+++ b/drivers/gpio/gpiolib-legacy.c
@@ -5,6 +5,64 @@
 
 #include "gpiolib.h"
 
+/* Warn when drivers omit gpio_request() calls -- legal but ill-advised
+ * when setting direction, and otherwise illegal.  Until board setup code
+ * and drivers use explicit requests everywhere (which won't happen when
+ * those calls have no teeth) we can't avoid autorequesting.  This nag
+ * message should motivate switching to explicit requests... so should
+ * the weaker cleanup after faults, compared to gpio_request().
+ *
+ * NOTE: the autorequest mechanism is going away; at this point it's
+ * only "legal" in the sense that (old) code using it won't break yet,
+ * but instead only triggers a WARN() stack dump.
+ */
+static int gpio_ensure_requested(struct gpio_desc *desc)
+{
+   struct gpio_chip *chip = desc->chip;
+   unsigned long flags;
+   bool request = false;
+   int err = 0;
+
+   spin_lock_irqsave(&gpio_lock, flags);
+
+   if (WARN(test_and_set_bit(FLAG_REQUESTED, &desc->flags) == 0,
+   "autorequest GPIO-%d\n", desc_to_gpio(desc))) {
+   if (!try_module_get(chip->owner)) {
+   gpiod_err(desc, "%s: module can't be gotten\n",
+   __func__);
+   clear_bit(FLAG_REQUESTED, &desc->flags);
+   /* lose */
+   err = -EIO;
+   goto end;
+   }
+   desc->label = "[auto]";
+   /* caller must chip->request() w/o spinlock */
+   if (chip->request)
+   request = true;
+   }
+
+end:
+   spin_unlock_irqrestore(&gpio_lock, flags);
+
+   if (request) {
+   might_sleep_if(chip->can_sleep);
+   err = chip->request(chip, gpio_chip_hwgpio(desc));
+
+   if (err < 0) {
+   gpiod_dbg(desc, "%s: chip request fail, %d\n",
+   __func__, err);
+   spin_lock_irqsave(&gpio_lock, flags);
+
+   desc->label = NULL;
+   clear_bit(FLAG_REQUESTED, &desc->flags);
+
+   spin_unlock_irqrestore(&gpio_lock, flags);
+   }
+   }
+
+   return err;
+}
+
 void gpio_free(unsigned gpio)
 {
gpiod_free(gpio_to_desc(gpio));
@@ -97,3 +155,51 @@ void gpio_free_array(const struct gpio *array, size_t num)
gpio_free((array++)->gpio);
 }
 EXPORT_SYMBOL_GPL(gpio_free_array);
+
+int gpio_direction_input(unsigned gpio)
+{
+   struct gpio_desc *desc = gpio_to_desc(gpio);
+   int err;
+
+   if (!desc)
+   return -EINVAL;
+
+   err = gpio_ensure_requested(desc);
+   if (err < 0)
+   return err;
+
+   return gpiod_direction_input(desc);
+}
+EXPORT_SYMBOL_GPL(gpio_direction_input);
+
+int gpio_direction_output(unsigned gpio, int value)
+{
+   struct gpio_desc *desc = gpio_to_desc(gpio);
+   int err;
+
+   if (!desc)
+   return -EINVAL;
+
+   err = gpio_ensure_requested(desc);
+   if (err < 0)
+   return err;
+
+   return gpiod_direction_output_raw(desc, value);
+}
+EXPORT_SYMBOL_GPL(gpio_direction_output);
+
+int gpio_set_debounce(unsigned gpio, unsigned debounce)
+{
+   struct gpio_desc *desc = gpio_to_desc(gpio);
+   int err;
+
+   if (!desc)
+   return -EINVAL;
+
+   err = gpio_ensure_requested(desc);
+   if (err < 0)
+   return err;
+
+   return gpiod_set_debounce(desc, debounce);
+}
+EXPORT_SYMBOL_GPL(gpio_set_debounce);
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 7582207c92e7..412d64e93cfb 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -95,39 +95,6 @@ int desc_to_gpio(const struct gpio_desc *desc)
 EXPORT_SYMBOL_GPL(desc_to_gpio);
 
 
-/* Warn when drivers omit gpio_request() calls -- legal but ill-advised
- * when setting
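
[Editorial sketch] The split this patch makes between the legacy wrappers and
the descriptor API can be shown with a userspace model. All names are
hypothetical stand-ins (struct desc_model, model_*()); the warnings counter
takes the place of the WARN() stack dump, and locking, module refcounting and
the chip->request() callback from the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor state. */
struct desc_model {
	bool requested;
	bool is_input;
	const char *label;
	int warnings;	/* stands in for the WARN() stack dump */
};

/* Auto-request a GPIO that the caller forgot to request, and nag once. */
static int model_ensure_requested(struct desc_model *d)
{
	if (!d->requested) {
		d->requested = true;
		d->label = "[auto]";
		d->warnings++;
	}
	return 0;
}

/* Descriptor-style call: assumes the GPIO was requested by the caller. */
static int model_gpiod_direction_input(struct desc_model *d)
{
	d->is_input = true;
	return 0;
}

/* Legacy-style wrapper: the only place that still auto-requests. */
static int model_gpio_direction_input(struct desc_model *d)
{
	int err;

	if (!d)
		return -1;
	err = model_ensure_requested(d);
	if (err < 0)
		return err;
	return model_gpiod_direction_input(d);
}
```

Calling the legacy wrapper on an unrequested descriptor marks it requested
with the "[auto]" label and records one warning; a second call records no
further warning because the descriptor is already requested.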

[PATCH 0/5] gpio: a few cleanup patches

2014-07-22 Thread Alexandre Courbot

Still in order to prepare for the ability to share one GPIO between
several consumers, this series of mostly unrelated patches fixes a
few minor issues. Most of the patches should be no-brainers; maybe
patch 2 should be looked at more closely in order to understand why this
code was there in the first place. Patch 4 is not only a simplification
of the API, but a hard requirement if we are to allow several GPIO
descriptors to manipulate the same GPIO, as no driver function should
require a descriptor to perform properly.

This series has been tested on Raspberry Pi and Jetson TK1 without any
problem being noticed.

Alexandre Courbot (5):
  gpio: remove export of private of_get_named_gpio_flags()
  gpio: simplify gpiochip_export()
  gpio: make gpiochip_get_desc() gpiolib-private
  gpio: remove gpiod_lock/unlock_as_irq()
  gpio: move gpio_ensure_requested() into legacy C file

 Documentation/gpio/driver.txt |   4 +-
 drivers/gpio/gpiolib-acpi.c   |   6 +-
 drivers/gpio/gpiolib-legacy.c | 106 ++--
 drivers/gpio/gpiolib-of.c |   3 +-
 drivers/gpio/gpiolib-sysfs.c  |  24 +++
 drivers/gpio/gpiolib.c| 160 ++
 drivers/gpio/gpiolib.h|   2 +
 include/asm-generic/gpio.h|  18 +
 include/linux/gpio/driver.h   |   7 +-
 9 files changed, 144 insertions(+), 186 deletions(-)

-- 
2.0.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/5] gpio: remove export of private of_get_named_gpio_flags()

2014-07-22 Thread Alexandre Courbot

of_get_named_gpiod_flags() has been made gpiolib-private by commit
f01d907582, but its EXPORT statement has not been removed. Fix this.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpio/gpiolib-of.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c
index e60cdab1d15e..3e2fae205bee 100644
--- a/drivers/gpio/gpiolib-of.c
+++ b/drivers/gpio/gpiolib-of.c
@@ -95,7 +95,6 @@ struct gpio_desc *of_get_named_gpiod_flags(struct device_node *np,
 PTR_ERR_OR_ZERO(gg_data.out_gpio));
return gg_data.out_gpio;
 }
-EXPORT_SYMBOL(of_get_named_gpiod_flags);
 
 int of_get_named_gpio_flags(struct device_node *np, const char *list_name,
int index, enum of_gpio_flags *flags)
-- 
2.0.1


Re: [PATCH] Staging: gdm724x: gdm_lte.c: Fix warning of prefer ether_addr_copy()

2014-07-22 Thread Greg KH

On Tue, Jul 22, 2014 at 12:26:42PM +0530, Kiran Padwal wrote:
> This patch fixes the following checkpatch.pl warnings:
> WARNING: "Prefer ether_addr_copy() over memcpy() if the Ethernet addresses 
> are __aligned(2)".

Is that true here?

Have you tested this on the hardware to ensure it works properly?

I hate that checkpatch message...

thanks,

greg k-h
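
As background to the question above: ether_addr_copy() requires both
pointers to be at least 2-byte aligned, so whether the conversion is safe
depends on the alignment of buf plus the header offsets. A minimal
userspace sketch of that check follows. It is not the driver's code:
struct bootp_fixed is a hypothetical stand-in for the fixed BOOTP/DHCP
header layout, and the 20- and 8-byte constants assume an IPv4 header
without options followed by a UDP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fixed part of a BOOTP/DHCP message. */
struct bootp_fixed {
    uint8_t  op, htype, hlen, hops;
    uint32_t xid;
    uint16_t secs, flags;
    uint32_t ciaddr, yiaddr, siaddr, giaddr;
    uint8_t  chaddr[16];
};

/* The layout above has no padding, so chaddr sits at byte offset 28. */
static_assert(offsetof(struct bootp_fixed, chaddr) == 28,
              "chaddr must sit at offset 28");

#define IPV4_HDR_LEN 20u /* assumption: no IP options */
#define UDP_HDR_LEN   8u

/* Byte offset of chaddr from the start of the IP packet. */
static size_t chaddr_offset(void)
{
    return IPV4_HDR_LEN + UDP_HDR_LEN + offsetof(struct bootp_fixed, chaddr);
}

/* Non-zero when p satisfies the __aligned(2) precondition. */
static int is_aligned2(const void *p)
{
    return ((uintptr_t)p & 1u) == 0;
}
```

Under these assumptions the offset is even (56 bytes), so the source
address is 2-byte aligned exactly when buf itself is. That property of
the receive buffer is what would have to be confirmed for the driver.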

Re: [PATCH v3 0/5] phy: miphy365x: Introduce support for MiPHY365x

2014-07-22 Thread Kishon Vijay Abraham I

Hi,

On Monday 21 July 2014 01:09 PM, Lee Jones wrote:
> Hi Kishon,
> 
>>> This patchset is based on the two core patches you sent to the
>>> list which facilitate creating PHYs residing on multi-channel
>>> controllers.  The changes since the last submission centre
>>> around dynamic PHY creation based solely on what is provided via
>>> Device Tree, as requested.  The other review comments have also
>>> been addressed in this set.
>>
>> Merged the first four patches of this series. (merged v3+1 for 1st patch).
>> I'm not sure who would be merging the dt patch (5/5). But for now I've merged
>> 2/5 in linux-phy tree since the phy patch seems to be dependent on it.
>> So the dt patch can be merged only after -rc1. However if whoever merges
>> dt wants to get it merged during the merge window, I can prepare a common
>> branch which both of us can merge to. Either ways, dt Maintainer should
>> let me know.
> 
> Thanks for applying.  It would be really helpful if you could put just
> patch 2/5 onto an immutable branch and provide a tag e that Maxime
> could pull from.

Sure. Created tag
git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy.git
miphy365x_dt_common_header (branch:miphy365x_dt_header). It is based on 
3.16-rc5.

Thanks
Kishon
> 
> Kind regards,
> Lee
> 
>>>  .../devicetree/bindings/phy/phy-miphy365x.txt  |  76 +++
>>>  arch/arm/boot/dts/stih416-b2020-revE.dts   |  10 +
>>>  arch/arm/boot/dts/stih416-b2020.dts|  12 +
>>>  arch/arm/boot/dts/stih416.dtsi |  22 +
>>>  drivers/phy/Kconfig|  10 +
>>>  drivers/phy/Makefile   |   1 +
>>>  drivers/phy/phy-miphy365x.c| 636 +
>>>  include/dt-bindings/phy/phy-miphy365x.h|  14 +
>>>  8 files changed, 781 insertions(+)
> 

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Daniel Vetter

On Mon, Jul 21, 2014 at 10:23:43PM +0300, Oded Gabbay wrote:
> But Jerome, the core problem still remains in effect, even with your
> suggestion. If an application, either via userspace queue or via ioctl,
> submits a long-running kernel, than the CPU in general can't stop the
> GPU from running it. And if that kernel does while(1); than that's it,
> game's over, and no matter how you submitted the work. So I don't really
> see the big advantage in your proposal. Only in CZ we can stop this wave
> (by CP H/W scheduling only). What are you saying is basically I won't
> allow people to use compute on Linux KV system because it _may_ get the
> system stuck.
> 
> So even if I really wanted to, and I may agree with you theoretically on
> that, I can't fulfill your desire to make the "kernel being able to
> preempt at any time and be able to decrease or increase user queue
> priority so overall kernel is in charge of resources management and it
> can handle rogue client in proper fashion". Not in KV, and I guess not
> in CZ as well.

At least on intel the execlist stuff which is used for preemption can be
used by both the cpu and the firmware scheduler. So we can actually
preempt when doing cpu scheduling.

It sounds like current amd hw doesn't have any preemption at all. And
without preemption I don't think we should ever consider to allow
userspace to directly submit stuff to the hw and overload. Imo the kernel
_must_ sit in between and reject clients that don't behave. Of course you
can only ever react (worst case with a gpu reset, there's code floating
around for that on intel-gfx), but at least you can do something.

If userspace has a direct submit path to the hw then this gets really
tricky, if not impossible.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [STLinux Kernel] [PATCH v3 0/5] phy: miphy365x: Introduce support for MiPHY365x

2014-07-22 Thread Maxime Coquelin




On 07/22/2014 09:23 AM, Kishon Vijay Abraham I wrote:
> Hi,
>
> On Monday 21 July 2014 01:09 PM, Lee Jones wrote:
>> Hi Kishon,
>>
>>>> This patchset is based on the two core patches you sent to the
>>>> list which facilitate creating PHYs residing on multi-channel
>>>> controllers.  The changes since the last submission centre
>>>> around dynamic PHY creation based solely on what is provided via
>>>> Device Tree, as requested.  The other review comments have also
>>>> been addressed in this set.
>>>
>>> Merged the first four patches of this series. (merged v3+1 for 1st patch).
>>> I'm not sure who would be merging the dt patch (5/5). But for now I've merged
>>> 2/5 in linux-phy tree since the phy patch seems to be dependent on it.
>>> So the dt patch can be merged only after -rc1. However if whoever merges
>>> dt wants to get it merged during the merge window, I can prepare a common
>>> branch which both of us can merge to. Either ways, dt Maintainer should
>>> let me know.
>>
>> Thanks for applying.  It would be really helpful if you could put just
>> patch 2/5 onto an immutable branch and provide a tag e that Maxime
>> could pull from.
>
> Sure. Created tag
> git://git.kernel.org/pub/scm/linux/kernel/git/kishon/linux-phy.git
> miphy365x_dt_common_header (branch:miphy365x_dt_header). It is based on
> 3.16-rc5.

Thanks Kishon, I will merge it today.

Regards,
Maxime

>
> Thanks
> Kishon
>>
>> Kind regards,
>> Lee
>>
>>>>  .../devicetree/bindings/phy/phy-miphy365x.txt  |  76 +++
>>>>  arch/arm/boot/dts/stih416-b2020-revE.dts   |  10 +
>>>>  arch/arm/boot/dts/stih416-b2020.dts|  12 +
>>>>  arch/arm/boot/dts/stih416.dtsi |  22 +
>>>>  drivers/phy/Kconfig|  10 +
>>>>  drivers/phy/Makefile   |   1 +
>>>>  drivers/phy/phy-miphy365x.c| 636 +
>>>>  include/dt-bindings/phy/phy-miphy365x.h|  14 +
>>>>  8 files changed, 781 insertions(+)




___
Kernel mailing list
ker...@stlinux.com
http://www.stlinux.com/mailman/listinfo/kernel



Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Daniel Vetter

On Mon, Jul 21, 2014 at 03:03:07PM -0400, Jerome Glisse wrote:
> On Mon, Jul 21, 2014 at 09:41:29PM +0300, Oded Gabbay wrote:
> > On 21/07/14 21:22, Daniel Vetter wrote:
> > > On Mon, Jul 21, 2014 at 7:28 PM, Oded Gabbay  wrote:
> > >>> I'm not sure whether we can do the same trick with the hw scheduler. But
> > >>> then unpinning hw contexts will drain the pipeline anyway, so I guess we
> > >>> can just stop feeding the hw scheduler until it runs dry. And then unpin
> > >>> and evict.
> > >> So, I'm afraid but we can't do this for AMD Kaveri because:
> > > 
> > > Well as long as you can drain the hw scheduler queue (and you can do
> > > that, worst case you have to unmap all the doorbells and other stuff
> > > to intercept further submission from userspace) you can evict stuff.
> > 
> > I can't drain the hw scheduler queue, as I can't do mid-wave preemption.
> > Moreover, if I use the dequeue request register to preempt a queue
> > during a dispatch it may be that some waves (wave groups actually) of
> > the dispatch have not yet been created, and when I reactivate the mqd,
> > they should be created but are not. However, this works fine if you use
> > the HIQ. the CP ucode correctly saves and restores the state of an
> > outstanding dispatch. I don't think we have access to the state from
> > software at all, so it's not a bug, it is "as designed".
> > 
> 
> I think here Daniel is suggesting to unmapp the doorbell page, and track
> each write made by userspace to it and while unmapped wait for the gpu to
> drain or use some kind of fence on a special queue. Once GPU is drain we
> can move pinned buffer, then remap the doorbell and update it to the last
> value written by userspace which will resume execution to the next job.

Exactly, just prevent userspace from submitting more. And if you have
misbehaving userspace that submits too much, reset the gpu and tell it
that you're sorry but won't schedule any more work.

We have this already in i915 (since like all other gpus we're not
preempting right now) and it works. There's some code floating around to
even restrict the reset to _just_ the offending submission context, with
nothing else getting corrupted.

You can do all this with the doorbells and unmapping them, but it's a
pain. Much easier if you have a real ioctl, and I haven't seen anyone with
perf data indicating that an ioctl would be too much overhead on linux.
Neither in this thread nor internally here at intel.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [PATCH v8 02/11] power: reset: Add reboot driver for brcmstb

2014-07-22 Thread Arnd Bergmann

On Monday 21 July 2014 14:07:57 Brian Norris wrote:
> diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig
> index 0073633e7699..9782e8d80647 100644
> --- a/arch/arm/mach-bcm/Kconfig
> +++ b/arch/arm/mach-bcm/Kconfig
> @@ -94,6 +94,7 @@ config ARCH_BRCMSTB
> select MIGHT_HAVE_PCI
> select HAVE_SMP
> select HAVE_ARM_ARCH_TIMER
> +   select POWER_RESET_BRCMSTB
> help
>   Say Y if you intend to run the kernel on a Broadcom ARM-based STB
>   chipset.
> diff --git a/drivers/power/reset/Kconfig b/drivers/power/reset/Kconfig
> index bdcf5173e377..fcb9825debe5 100644
> --- a/drivers/power/reset/Kconfig
> +++ b/drivers/power/reset/Kconfig
> @@ -20,6 +20,16 @@ config POWER_RESET_AXXIA
>  
>   Say Y if you have an Axxia family SoC.
>  
> +config POWER_RESET_BRCMSTB
> +   bool "Broadcom STB reset driver"
> +   depends on POWER_RESET && ARCH_BRCMSTB
> +   help
> + This driver provides restart support for ARM-based Broadcom STB
> + boards.
> +
> + Say Y here if you have an ARM-based Broadcom STB board and you wish
> + to have restart support.
> +
>  config POWER_RESET_GPIO
> bool "GPIO power-off driver"
> depends on OF_GPIO && POWER_RESET
> 

(nitpicking)

You shouldn't have both a user-selectable option and 'select' it from
the platform, because it makes it inherently not selectable, in particular
in the combination with 'depends on ARCH_BRCMSTB'.

One way to solve this would be to change the dependency to

config POWER_RESET_BRCMSTB
	bool "Broadcom STB reset driver"
	depends on POWER_RESET && ARM
	depends on ARCH_BRCMSTB || COMPILE_TEST

which in effect would allow building it on any ARM machine as long as
COMPILE_TEST is set (which normally is not).

The same could be expressed using

config POWER_RESET_BRCMSTB
	bool "Broadcom STB reset driver" if COMPILE_TEST
	depends on POWER_RESET && ARM

My preference in this case however would be to just drop the 'select'
statement and add the driver to the defconfig file.

Arnd

Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Peter Zijlstra

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> Hello,
> 
> This patch try to solve problem that a long-lasting page cache of
> ext4 superblock disturbs page migration.
> 
> I've been testing CMA feature on my ARM-based platform
> and found some pages for page caches cannot be migrated.
> Some of them are page caches of superblock of ext4 filesystem.
> 
> Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> from movable area. But the problem is that ext4 hold the page until
> it is unmounted. If root filesystem is ext4 the page cannot be migrated 
> forever.
> 
> I introduce a new API for allocating page from non-movable area.
> It is useful for ext4 and others that want to hold page cache for a long time.

There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.

Re: [RFC PATCH 0/5] futex: introduce an optimistic spinning futex

2014-07-22 Thread Peter Zijlstra

On Mon, Jul 21, 2014 at 05:32:55PM -0700, Davidlohr Bueso wrote:
> On Mon, 2014-07-21 at 14:31 -0700, Andy Lutomirski wrote:
> > On Mon, Jul 21, 2014 at 2:27 PM, Peter Zijlstra wrote:
> > > All this is predicated on the fact that syscalls are 'expensive'.
> > > Weren't syscalls only 100s of cycles? All this bitmap mucking is far
> > > more expensive due to cacheline misses, which due to the size of the
> > > things is almost guaranteed.
> > 
> > 120 - 300 cycles for me, unless tracing happens, and I'm working on
> > reducing the incidence of tracing.
> 
> fwiw here's what lmbench's lat_ctx says on my system . For 'accuracy', I
> kept the runs short.
> 
> http://www.stgolabs.net/lat_ctx.png

Create a syscall without 'work body', so say something like the below.
Context switches do quite a lot of work (unfortunately).

diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index ec255a1646d2..daf27ec10e29 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -323,6 +323,7 @@
 314     common  sched_setattr           sys_sched_setattr
 315     common  sched_getattr           sys_sched_getattr
 316     common  renameat2               sys_renameat2
+317     common  nop                     sys_nop
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/kernel/sys.c b/kernel/sys.c
index 66a751ebf9d9..1b86614bb551 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2108,6 +2108,11 @@ SYSCALL_DEFINE1(sysinfo, struct sysinfo __user *, info)
return 0;
 }
 
+SYSCALL_DEFINE0(nop)
+{
+   return 0;
+}
+
 #ifdef CONFIG_COMPAT
 struct compat_sysinfo {
s32 uptime;
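
To put a number next to the 120 - 300 cycle figure quoted above, the
empty syscall can be timed from userspace. The following is a minimal
sketch, assuming Linux and glibc; it reports nanoseconds rather than
cycles. avg_syscall_ns() is a hypothetical helper, and SYS_getpid serves
as a cheap stand-in: number 317 only means sys_nop on a kernel carrying
the patch above, and on any other kernel it may be unassigned or mean
something else entirely, so it should not be invoked blindly.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average wall-clock cost, in ns, of one call to syscall number nr
 * invoked without arguments. Returns 0.0 when iters is 0.
 */
static double avg_syscall_ns(long nr, unsigned long iters)
{
    uint64_t start, end;
    unsigned long i;

    if (iters == 0)
        return 0.0;

    start = now_ns();
    for (i = 0; i < iters; i++)
        syscall(nr);
    end = now_ns();

    return (double)(end - start) / (double)iters;
}
```

Calling avg_syscall_ns(SYS_getpid, 1000000) gives a baseline; on a
patched kernel, passing 317 instead would isolate the bare entry/exit
overhead, which can then be compared with the lat_ctx numbers.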

Re: [PATCH v8 00/11] ARM: brcmstb: Add Broadcom STB SoC support

2014-07-22 Thread Arnd Bergmann

On Monday 21 July 2014 14:07:55 Brian Norris wrote:
> I'm taking over the latest resubmission of this patch series.
> There are a few moderate changes for v8 (noted below), but we
> are waiting mostly for an Ack for the reboot driver.
> 
> This patchset contains the board support package for the
> Broadcom BCM7445 ARM-based SoC [1]. These changes contain a
> minimal set of code needed for a BCM7445-based board to boot
> the Linux kernel.
> 
> These changes heavily leverage the OF/devicetree framework. The
> machine is also built into the multi-platform ARMv7 image.
> 
> Changes are also available here:
> 
>   https://github.com/brcm/linux/tree/brcmstb-v8
>   git://github.com/brcm/linux.git +brcmstb-v8

Whole series

Acked-by: Arnd Bergmann 

I think we should try to get this merged into 3.17, it's already
taken too long and the patches look good.

Please add the core architecture patches for arch/arm into Russell's
patch tracker http://www.arm.linux.org.uk/developer/patches/.

For the platform changes in the first patch, I would prefer to have
Matt pick up the first patch, but we can also apply it directly into
arm-soc if he prefers that.

The reset driver can ideally go through the drivers/power/ maintainers,
but if they are not interested in merging it, we can also take that
through arm-soc. See also my one comment on that driver.

Arnd

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Daniel Vetter

On Tue, Jul 22, 2014 at 09:28:51AM +0200, Daniel Vetter wrote:
> On Mon, Jul 21, 2014 at 03:03:07PM -0400, Jerome Glisse wrote:
> > On Mon, Jul 21, 2014 at 09:41:29PM +0300, Oded Gabbay wrote:
> > > On 21/07/14 21:22, Daniel Vetter wrote:
> > > > On Mon, Jul 21, 2014 at 7:28 PM, Oded Gabbay wrote:
> > > >>> I'm not sure whether we can do the same trick with the hw scheduler. But
> > > >>> then unpinning hw contexts will drain the pipeline anyway, so I guess we
> > > >>> can just stop feeding the hw scheduler until it runs dry. And then unpin
> > > >>> and evict.
> > > >> So, I'm afraid but we can't do this for AMD Kaveri because:
> > > > 
> > > > Well as long as you can drain the hw scheduler queue (and you can do
> > > > that, worst case you have to unmap all the doorbells and other stuff
> > > > to intercept further submission from userspace) you can evict stuff.
> > > 
> > > I can't drain the hw scheduler queue, as I can't do mid-wave preemption.
> > > Moreover, if I use the dequeue request register to preempt a queue
> > > during a dispatch it may be that some waves (wave groups actually) of
> > > the dispatch have not yet been created, and when I reactivate the mqd,
> > > they should be created but are not. However, this works fine if you use
> > > the HIQ. the CP ucode correctly saves and restores the state of an
> > > outstanding dispatch. I don't think we have access to the state from
> > > software at all, so it's not a bug, it is "as designed".
> > > 
> > 
> > I think here Daniel is suggesting to unmapp the doorbell page, and track
> > each write made by userspace to it and while unmapped wait for the gpu to
> > drain or use some kind of fence on a special queue. Once GPU is drain we
> > can move pinned buffer, then remap the doorbell and update it to the last
> > value written by userspace which will resume execution to the next job.
> 
> Exactly, just prevent userspace from submitting more. And if you have
> misbehaving userspace that submits too much, reset the gpu and tell it
> that you're sorry but won't schedule any more work.
> 
> We have this already in i915 (since like all other gpus we're not
> preempting right now) and it works. There's some code floating around to
> even restrict the reset to _just_ the offending submission context, with
> nothing else getting corrupted.
> 
> You can do all this with the doorbells and unmapping them, but it's a
> pain. Much easier if you have a real ioctl, and I haven't seen anyone with
> perf data indicating that an ioctl would be too much overhead on linux.
> Neither in this thread nor internally here at intel.

Aside: Another reason why the ioctl is better than the doorbell is
integration with other drivers. Yeah I know this is about compute, but
sooner or later someone will want to e.g. post-proc video frames between
the v4l capture device and the gpu mpeg encoder. Or something else fancy.

Then you want to be able to somehow integrate into a cross-driver fence
framework like android syncpts, and you can't do that without an ioctl for
the compute submissions.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

[RESEND PATCH v3 1/6] charger: max14577: Add support for MAX77836 charger

2014-07-22 Thread Krzysztof Kozlowski

Add support for MAX77836 charger to the max14577 driver. The MAX77836
charger is almost the same as 14577 model except:
 - No dead-battery detection;
 - Support for special charger (like in MAX77693);
 - Support for DX over-voltage protection (like in MAX77693);
 - Lower values of charging current (two times lower current for
   slow/fast charge, much lower EOC current);
 - Slightly different values in ChgTyp field of STATUS2 register. On
   MAX14577 0x6 is reserved and 0x7 dead battery. On the MAX77836 the
   0x6 means special charger and 0x7 is reserved. Regardless of these
   differences the driver maps them to one enum max14577_muic_charger_type.
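
The last point can be illustrated in isolation. The sketch below is a
userspace approximation of the remapping, not the driver code itself:
map_chgtyp() and the macro names are hypothetical, and the assumption
that CHGTYP is a 3-bit field comes only from the 0x0..0x7 values
discussed here. On MAX77836 the two ambiguous raw values get bit 3 set,
so 0x6 becomes 0xe and 0x7 becomes 0xf and no longer collide with the
MAX14577 meanings.

```c
#include <assert.h>
#include <stdint.h>

/* STATUS2/CHGTYP raw values whose meaning differs between the chips. */
#define CHGTYP_RAW_0X6 0x6 /* MAX14577: reserved, MAX77836: special charger */
#define CHGTYP_RAW_0X7 0x7 /* MAX14577: dead battery, MAX77836: reserved */

/* Bit set on MAX77836 to move 0x6/0x7 out of the MAX14577 value range. */
#define MAX77836_REMAP_BIT 0x8

/* Map a raw CHGTYP value to a chip-independent charger type value. */
static uint8_t map_chgtyp(int is_max77836, uint8_t raw)
{
    assert(raw <= 0x7); /* assumption: CHGTYP is a 3-bit field */

    if (is_max77836 && (raw == CHGTYP_RAW_0X6 || raw == CHGTYP_RAW_0X7))
        return raw | MAX77836_REMAP_BIT;
    return raw;
}
```

Values 0x0 to 0x5 pass through unchanged on both chips, which is why a
single enum can describe both.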

Signed-off-by: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Cc: Anton Vorontsov 
Cc: Dmitry Eremin-Solenikov 
Cc: David Woodhouse 
Acked-by: Lee Jones 
---
 drivers/power/Kconfig|  4 +-
 drivers/power/max14577_charger.c | 77 +---
 include/linux/mfd/max14577-private.h | 54 ++---
 3 files changed, 104 insertions(+), 31 deletions(-)

diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
index ba6975123071..94086a5238c6 100644
--- a/drivers/power/Kconfig
+++ b/drivers/power/Kconfig
@@ -318,11 +318,11 @@ config CHARGER_MANAGER
   with help of suspend_again support.
 
 config CHARGER_MAX14577
-   tristate "Maxim MAX14577 MUIC battery charger driver"
+   tristate "Maxim MAX14577/77836 battery charger driver"
depends on MFD_MAX14577
help
  Say Y to enable support for the battery charger control sysfs and
- platform data of MAX14577 MUICs.
+ platform data of MAX14577/77836 MUICs.
 
 config CHARGER_MAX8997
tristate "Maxim MAX8997/MAX8966 PMIC battery charger driver"
diff --git a/drivers/power/max14577_charger.c b/drivers/power/max14577_charger.c
index fad2a75b3604..19c8f42abf24 100644
--- a/drivers/power/max14577_charger.c
+++ b/drivers/power/max14577_charger.c
@@ -1,7 +1,7 @@
 /*
- * Battery charger driver for the Maxim 14577
+ * max14577_charger.c - Battery charger driver for the Maxim 14577/77836
  *
- * Copyright (C) 2013 Samsung Electronics
+ * Copyright (C) 2013,2014 Samsung Electronics
  * Krzysztof Kozlowski 
  *
  * This program is free software; you can redistribute it and/or modify
@@ -25,10 +25,35 @@ struct max14577_charger {
struct max14577 *max14577;
struct power_supply charger;
 
-   unsigned int    charging_state;
-   unsigned int    battery_state;
+   unsigned int            charging_state;
+   unsigned int            battery_state;
 };
 
+/*
+ * Helper function for mapping values of STATUS2/CHGTYP register on max14577
+ * and max77836 chipsets to enum maxim_muic_charger_type.
+ */
+static enum max14577_muic_charger_type maxim_get_charger_type(
+   enum maxim_device_type dev_type, u8 val) {
+   switch (val) {
+   case MAX14577_CHARGER_TYPE_NONE:
+   case MAX14577_CHARGER_TYPE_USB:
+   case MAX14577_CHARGER_TYPE_DOWNSTREAM_PORT:
+   case MAX14577_CHARGER_TYPE_DEDICATED_CHG:
+   case MAX14577_CHARGER_TYPE_SPECIAL_500MA:
+   case MAX14577_CHARGER_TYPE_SPECIAL_1A:
+   return val;
+   case MAX14577_CHARGER_TYPE_DEAD_BATTERY:
+   case MAX14577_CHARGER_TYPE_RESERVED:
+   if (dev_type == MAXIM_DEVICE_TYPE_MAX77836)
+   val |= 0x8;
+   return val;
+   default:
+   WARN_ONCE(1, "max14577: Unsupported chgtyp register value 0x%02x", val);
+   return val;
+   }
+}
+
 static int max14577_get_charger_state(struct max14577_charger *chg)
 {
struct regmap *rmap = chg->max14577->regmap;
@@ -89,19 +114,23 @@ static int max14577_get_online(struct max14577_charger *chg)
 {
struct regmap *rmap = chg->max14577->regmap;
u8 reg_data;
+   enum max14577_muic_charger_type chg_type;
 
    max14577_read_reg(rmap, MAX14577_MUIC_REG_STATUS2, &reg_data);
reg_data = ((reg_data & STATUS2_CHGTYP_MASK) >> STATUS2_CHGTYP_SHIFT);
-   switch (reg_data) {
+   chg_type = maxim_get_charger_type(chg->max14577->dev_type, reg_data);
+   switch (chg_type) {
case MAX14577_CHARGER_TYPE_USB:
case MAX14577_CHARGER_TYPE_DEDICATED_CHG:
case MAX14577_CHARGER_TYPE_SPECIAL_500MA:
case MAX14577_CHARGER_TYPE_SPECIAL_1A:
case MAX14577_CHARGER_TYPE_DEAD_BATTERY:
+   case MAX77836_CHARGER_TYPE_SPECIAL_BIAS:
return 1;
case MAX14577_CHARGER_TYPE_NONE:
case MAX14577_CHARGER_TYPE_DOWNSTREAM_PORT:
case MAX14577_CHARGER_TYPE_RESERVED:
+   case MAX77836_CHARGER_TYPE_RESERVED:
default:
return 0;
}
@@ -118,10 +147,12 @@ static int max14577_get_battery_health(struct max14577_charger *chg)
struct regmap *rmap = chg->max14577->regmap;
int state = POWER_SUPPLY_HEALTH_GOOD;
u8 reg_data;
+   enum max14577_muic_charger_type chg_ty

[RESEND PATCH v3 4/6] power: max17040: Add ID for MAX77836 Fuel Gauge block

2014-07-22 Thread Krzysztof Kozlowski

MAX77836 has the same Fuel Gauge as MAX17040/17048. The max17040 driver
can be safely re-used. The patch adds MAX77836 ID to array of
i2c_device_id.

Signed-off-by: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Cc: Anton Vorontsov 
Cc: Dmitry Eremin-Solenikov 
Cc: David Woodhouse 
---
 drivers/power/max17040_battery.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/power/max17040_battery.c b/drivers/power/max17040_battery.c
index 0fbac861080d..165ffe381803 100644
--- a/drivers/power/max17040_battery.c
+++ b/drivers/power/max17040_battery.c
@@ -278,6 +278,7 @@ static SIMPLE_DEV_PM_OPS(max17040_pm_ops, max17040_suspend, max17040_resume);
 
 static const struct i2c_device_id max17040_id[] = {
{ "max17040", 0 },
+   { "max77836-battery", 0 },
{ }
 };
 MODULE_DEVICE_TABLE(i2c, max17040_id);
-- 
1.9.1


[RESEND PATCH v3 3/6] charger: max14577: Configure battery-dependent settings from DTS and sysfs

2014-07-22 Thread Krzysztof Kozlowski

Remove hard-coded values for:
 - Fast Charge current,
 - End Of Charge current,
 - Fast Charge timer,
 - Overvoltage Protection Threshold,
 - Battery Constant Voltage,
and use DTS or sysfs to configure them. This allows using the max14577 charger
driver with different batteries.

Now the charger driver requires valid configuration data from DTS. In
case of wrong configuration data it fails during probe. Patch adds
of_compatible to the charger mfd cell in MFD driver core.

The fast charge timer is configured through sysfs entry.
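
The timer encoding used by max14577_set_fast_charge_timer() in the diff
below can be sketched on its own. This is a userspace approximation with
a hypothetical helper name; it computes only the unshifted field value,
since the value of CHGCTRL1_TCHW_SHIFT is not shown in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TCHW_DISABLE 0x7 /* field value that turns the timer off */

/*
 * Translate a timeout in hours into the unshifted timer field:
 * 0 disables the timer, 5..7 hours map to 2..4, anything else is
 * rejected with -EINVAL and *field is left untouched.
 */
static int fast_charge_timer_field(unsigned long hours, uint8_t *field)
{
    assert(field);

    if (hours == 0) {
        *field = TCHW_DISABLE;
        return 0;
    }
    if (hours >= 5 && hours <= 7) {
        *field = (uint8_t)(hours - 3);
        return 0;
    }
    return -EINVAL;
}
```

A value written through sysfs would go through this kind of check
before the register update, so an out-of-range request such as 4 hours
fails instead of silently selecting a neighbouring setting.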

Signed-off-by: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Cc: Dmitry Eremin-Solenikov 
Cc: David Woodhouse 
Cc: Jenny Tc 
Cc: Mark Rutland 
Acked-by: Lee Jones 
---
 drivers/mfd/max14577.c   |   5 +-
 drivers/power/Kconfig|   1 +
 drivers/power/max14577_charger.c | 311 +++
 include/linux/mfd/max14577-private.h |  19 +++
 include/linux/mfd/max14577.h |   7 +
 5 files changed, 314 insertions(+), 29 deletions(-)

diff --git a/drivers/mfd/max14577.c b/drivers/mfd/max14577.c
index e6f25aa0ded8..b8af263be594 100644
--- a/drivers/mfd/max14577.c
+++ b/drivers/mfd/max14577.c
@@ -116,7 +116,10 @@ static const struct mfd_cell max14577_devs[] = {
.name = "max14577-regulator",
.of_compatible = "maxim,max14577-regulator",
},
-   { .name = "max14577-charger", },
+   {
+   .name = "max14577-charger",
+   .of_compatible = "maxim,max14577-charger",
+   },
 };
 
 static const struct mfd_cell max77836_devs[] = {
diff --git a/drivers/power/Kconfig b/drivers/power/Kconfig
index 94086a5238c6..6c152f494c4a 100644
--- a/drivers/power/Kconfig
+++ b/drivers/power/Kconfig
@@ -320,6 +320,7 @@ config CHARGER_MANAGER
 config CHARGER_MAX14577
tristate "Maxim MAX14577/77836 battery charger driver"
depends on MFD_MAX14577
+   select SYSFS
help
  Say Y to enable support for the battery charger control sysfs and
  platform data of MAX14577/77836 MUICs.
diff --git a/drivers/power/max14577_charger.c b/drivers/power/max14577_charger.c
index 19c8f42abf24..c125756eab69 100644
--- a/drivers/power/max14577_charger.c
+++ b/drivers/power/max14577_charger.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct max14577_charger {
struct device *dev;
@@ -27,6 +28,8 @@ struct max14577_charger {
 
    unsigned int            charging_state;
    unsigned int            battery_state;
+
+   struct max14577_charger_platform_data   *pdata;
 };
 
 /*
@@ -178,15 +181,131 @@ static int max14577_get_present(struct max14577_charger *chg)
return 1;
 }
 
+static inline int max14577_set_fast_charge_timer(struct max14577_charger *chg,
+   unsigned long hours)
+{
+   u8 reg_data;
+
+   switch (hours) {
+   case 5 ... 7:
+   reg_data = hours - 3;
+   break;
+   case 0:
+   /* Disable */
+   reg_data = 0x7;
+   break;
+   default:
+   dev_err(chg->dev, "Wrong value for Fast-Charge Timer: %lu\n",
+   hours);
+   return -EINVAL;
+   }
+   reg_data <<= CHGCTRL1_TCHW_SHIFT;
+
+   return max14577_update_reg(chg->max14577->regmap,
+   MAX14577_REG_CHGCTRL1, CHGCTRL1_TCHW_MASK, reg_data);
+}
+
+static inline int max14577_init_constant_voltage(struct max14577_charger *chg,
+   unsigned int uvolt)
+{
+   u8 reg_data;
+
+   if (uvolt < MAXIM_CHARGER_CONSTANT_VOLTAGE_MIN ||
+   uvolt > MAXIM_CHARGER_CONSTANT_VOLTAGE_MAX)
+   return -EINVAL;
+
+   if (uvolt == 420)
+   reg_data = 0x0;
+   else if (uvolt == MAXIM_CHARGER_CONSTANT_VOLTAGE_MAX)
+   reg_data = 0x1f;
+   else if (uvolt <= 428) {
+   unsigned int val = uvolt;
+
+   val -= MAXIM_CHARGER_CONSTANT_VOLTAGE_MIN;
+   val /= MAXIM_CHARGER_CONSTANT_VOLTAGE_STEP;
+   if (uvolt <= 418)
+   reg_data = 0x1 + val;
+   else
+   reg_data = val; /* Fix for gap between 4.18V and 4.22V */
+   } else
+   return -EINVAL;
+
+   reg_data <<= CHGCTRL3_MBCCVWRC_SHIFT;
+
+   return max14577_write_reg(chg->max14577->regmap,
+   MAX14577_CHG_REG_CHG_CTRL3, reg_data);
+}
+
+static inline int max14577_init_eoc(struct max14577_charger *chg,
+   unsigned int uamp)
+{
+   unsigned int current_bits = 0xf;
+   u8 reg_data;
+
+   switch (chg->max14577->dev_type) {
+   case MAXIM_DEVICE_TYPE_MAX77836:
+   if (uamp < 5000)
+   return -EINVAL; /* Requested current is too low */
+
+   if (uamp >= 7500 && uamp < 1)
+   current_bits = 0x0;
+   else if (uamp <= 5) {
+

[RESEND PATCH v3 2/6] regulator/mfd: max14577: Export symbols for calculating charger current

2014-07-22 Thread Krzysztof Kozlowski

This patch prepares for changing the max14577 charger driver to allow
configuring battery-dependent settings from DTS.

The patch moves the following from the regulator driver to the MFD core
driver and exports them:
 - function for calculating register value for charger's current;
 - table of limits for chargers (MAX14577, MAX77836).

Previously they were used only by the max14577 regulator driver. In the
next patch the charger driver will use them as well. Exporting them
avoids duplicating the code.
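
The rounding behaviour of the exported helper can be tried outside the kernel
with a minimal user-space sketch. It mirrors the logic of
maxim_charger_calc_reg_current() from the diff below, but the struct name,
the limit values in example_limits and the two shift constants are
assumptions made up for illustration; the real ones live in the max14577
headers.

```c
#include <errno.h>

/* Same four fields as struct maxim_charger_current, values in microamps. */
struct charger_current {
	unsigned int min;
	unsigned int high_start;
	unsigned int high_step;
	unsigned int max;
};

/* Hypothetical limits, chosen only so the arithmetic is easy to follow. */
static const struct charger_current example_limits = {
	.min        = 90000,
	.high_start = 200000,
	.high_step  = 50000,
	.max        = 950000,
};

/* Assumed bit positions of the low bit and the high bits. */
#define LOW_BIT_SHIFT   4
#define HIGH_BITS_SHIFT 0

/*
 * Picks the highest register setting whose current does not exceed max_ua.
 * The result may correspond to a current below min_ua.
 * Returns 0 on success, -EINVAL if the request is outside the limits
 * (in which case *dst is left untouched).
 */
static int calc_reg_current(const struct charger_current *limits,
			    unsigned int min_ua, unsigned int max_ua,
			    unsigned char *dst)
{
	unsigned int bits;

	if (min_ua > max_ua)
		return -EINVAL;
	if (min_ua > limits->max || max_ua < limits->min)
		return -EINVAL;

	if (max_ua < limits->high_start) {
		/* Below the high range: low bit off, high bits zero. */
		*dst = 0x0;
		return 0;
	}

	/* Clamp to the chip maximum, then count whole steps above high_start. */
	if (max_ua > limits->max)
		max_ua = limits->max;
	bits = (max_ua - limits->high_start) / limits->high_step;

	*dst = (unsigned char)((0x1u << LOW_BIT_SHIFT) |
			       (bits << HIGH_BITS_SHIFT));
	return 0;
}
```

With these made-up limits, a request of 100000..460000 uA is rounded down to
five steps above high_start (450000 uA), which is why the kernel-doc says the
result is "always less or equal to max_ua".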

Signed-off-by: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Acked-by: Mark Brown 
Acked-by: Lee Jones 
---
 drivers/mfd/max14577.c   | 95 
 drivers/regulator/max14577.c | 80 ++
 include/linux/mfd/max14577-private.h | 22 -
 include/linux/mfd/max14577.h | 23 +
 4 files changed, 133 insertions(+), 87 deletions(-)

diff --git a/drivers/mfd/max14577.c b/drivers/mfd/max14577.c
index 4a5e885383f8..e6f25aa0ded8 100644
--- a/drivers/mfd/max14577.c
+++ b/drivers/mfd/max14577.c
@@ -26,6 +26,87 @@
 #include 
 #include 
 
+/*
+ * Table of valid charger currents for different Maxim chipsets.
+ * It is placed here because it is used by both charger and regulator driver.
+ */
+const struct maxim_charger_current maxim_charger_currents[] = {
+   [MAXIM_DEVICE_TYPE_UNKNOWN] = { 0, 0, 0, 0 },
+   [MAXIM_DEVICE_TYPE_MAX14577] = {
+   .min= MAX14577_CHARGER_CURRENT_LIMIT_MIN,
+   .high_start = MAX14577_CHARGER_CURRENT_LIMIT_HIGH_START,
+   .high_step  = MAX14577_CHARGER_CURRENT_LIMIT_HIGH_STEP,
+   .max= MAX14577_CHARGER_CURRENT_LIMIT_MAX,
+   },
+   [MAXIM_DEVICE_TYPE_MAX77836] = {
+   .min= MAX77836_CHARGER_CURRENT_LIMIT_MIN,
+   .high_start = MAX77836_CHARGER_CURRENT_LIMIT_HIGH_START,
+   .high_step  = MAX77836_CHARGER_CURRENT_LIMIT_HIGH_STEP,
+   .max= MAX77836_CHARGER_CURRENT_LIMIT_MAX,
+   },
+};
+EXPORT_SYMBOL_GPL(maxim_charger_currents);
+
+/*
+ * maxim_charger_calc_reg_current - Calculate register value for current
+ * @limits:constraints for charger, matching the MBCICHWRC register
+ * @min_ua:minimal requested current, micro Amps
+ * @max_ua:maximum requested current, micro Amps
+ * @dst:   destination to store calculated register value
+ *
+ * Calculates the value of MBCICHWRC (Fast Battery Charge Current) register
+ * for given current and stores it under pointed 'dst'. The stored value
+ * combines low bit (MBCICHWRCL) and high bits (MBCICHWRCH). It is also
+ * properly shifted.
+ *
+ * The calculated register value matches the current which:
+ *  - is always between ;
+ *  - is always less or equal to max_ua;
+ *  - is the highest possible value;
+ *  - may be lower than min_ua.
+ *
+ * On success returns 0. On error returns -EINVAL (requested min/max current
+ * is outside of given charger limits) and 'dst' is not set.
+ */
+int maxim_charger_calc_reg_current(const struct maxim_charger_current *limits,
+   unsigned int min_ua, unsigned int max_ua, u8 *dst)
+{
+   unsigned int current_bits = 0xf;
+
+   if (min_ua > max_ua)
+   return -EINVAL;
+
+   if (min_ua > limits->max || max_ua < limits->min)
+   return -EINVAL;
+
+   if (max_ua < limits->high_start) {
+   /*
+* Less than high_start, so set the minimal current
+* (turn Low Bit off, 0 as high bits).
+*/
+   *dst = 0x0;
+   return 0;
+   }
+
+   /* max_ua is in range: , cut it to limits.max */
+   max_ua = min(limits->max, max_ua);
+   max_ua -= limits->high_start;
+   /*
+* There is no risk of overflow 'max_ua' here because:
+*  - max_ua >= limits.high_start
+*  - BUILD_BUG checks that 'limits' are: max >= high_start + high_step
+*/
+   current_bits = max_ua / limits->high_step;
+
+   /* Turn Low Bit on (use range ) ... */
+   *dst = 0x1 << CHGCTRL4_MBCICHWRCL_SHIFT;
+   /* and set proper High Bits */
+   *dst |= current_bits << CHGCTRL4_MBCICHWRCH_SHIFT;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(maxim_charger_calc_reg_current);
+
 static const struct mfd_cell max14577_devs[] = {
{
.name = "max14577-muic",
@@ -463,6 +544,20 @@ static int __init max14577_i2c_init(void)
BUILD_BUG_ON(ARRAY_SIZE(max14577_i2c_id) != MAXIM_DEVICE_TYPE_NUM);
BUILD_BUG_ON(ARRAY_SIZE(max14577_dt_match) != MAXIM_DEVICE_TYPE_NUM);
 
+   /* Valid charger current values must be provided for each chipset */
+   BUILD_BUG_ON(ARRAY_SIZE(maxim_charger_currents) != 
MAXIM_DEVICE_TYPE_NUM);
+
+   /* Check for valid values for charger */
+   BUILD_BUG_ON(MAX14577_CHARGER_CURRENT_LIMIT_HIGH_START +
+   MAX14577_CHARGER_CURRENT_LIMIT_HIGH_STEP *

[RESEND PATCH v3 6/6] Documentation: charger: max14577: Document exported sysfs entry

2014-07-22 Thread Krzysztof Kozlowski

Document the 'fast charge timer' setting exported by max14577 driver
through sysfs entry.

Signed-off-by: Krzysztof Kozlowski 
---
 Documentation/ABI/testing/sysfs-class-power | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-class-power b/Documentation/ABI/testing/sysfs-class-power
index 78c7baca3587..83ee67ebf0e9 100644
--- a/Documentation/ABI/testing/sysfs-class-power
+++ b/Documentation/ABI/testing/sysfs-class-power
@@ -18,3 +18,17 @@ Description:
This file is writeable and can be used to set the assumed
battery 'full level'. As batteries age, this value has to be
amended over time.
+
+What:  /sys/class/power_supply/max14577-charger/device/fast_charge_timer
+Date:  July 2014
+KernelVersion: 3.17.0
+Contact:   Krzysztof Kozlowski 
+Description:
+   This entry shows and sets the maximum time the max14577
+   charger operates in fast-charge mode. When the timer expires
+   the device will terminate fast-charge mode (charging current
+   will drop to 0 A) and will trigger interrupt.
+
+   Valid values:
+   - 5, 6 or 7 (hours),
+   - 0: disabled.
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
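
The accepted fast_charge_timer values map onto the register field as in the
max14577_set_fast_charge_timer() hunk of patch 3/6 (5..7 hours become
hours - 3, 0 becomes 0x7). A minimal user-space sketch of that mapping
follows; the function name and the TCHW_DISABLE macro are made up for
illustration, and the final shift into CHGCTRL1 is left out.

```c
#include <errno.h>

/* Field value that disables the timer, as used in the driver hunk. */
#define TCHW_DISABLE 0x7

/*
 * Translates a fast-charge timer setting in hours into the unshifted
 * register field value. Returns 0 on success, -EINVAL for any value
 * other than 0, 5, 6 or 7 (in which case *reg is left untouched).
 */
static int fast_charge_hours_to_reg(unsigned long hours, unsigned char *reg)
{
	if (hours >= 5 && hours <= 7)
		*reg = (unsigned char)(hours - 3);
	else if (hours == 0)
		*reg = TCHW_DISABLE;
	else
		return -EINVAL;

	return 0;
}
```

The sketch uses a plain range test instead of the GNU "case 5 ... 7"
extension so that it builds with any C compiler.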

[RESEND PATCH v3 5/6] devicetree: mfd: max14577: Add device tree bindings document

2014-07-22 Thread Krzysztof Kozlowski

Add document describing device tree bindings for MAX14577 MFD
drivers: MFD core, extcon, regulator and charger.

Both MAX14577 and MAX77836 chipsets are documented.

Signed-off-by: Krzysztof Kozlowski 
Cc: Kyungmin Park 
Cc: Tomasz Figa 
Cc: devicet...@vger.kernel.org
Cc: Rob Herring 
Cc: Pawel Moll 
Cc: Mark Rutland 
Cc: Ian Campbell 
Cc: Kumar Gala 
Reviewed-by: Tomasz Figa 
---
 Documentation/devicetree/bindings/mfd/max14577.txt | 146 +
 1 file changed, 146 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mfd/max14577.txt

diff --git a/Documentation/devicetree/bindings/mfd/max14577.txt b/Documentation/devicetree/bindings/mfd/max14577.txt
new file mode 100644
index ..236264c10b92
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/max14577.txt
@@ -0,0 +1,146 @@
+Maxim MAX14577/77836 Multi-Function Device
+
+MAX14577 is a Multi-Function Device with Micro-USB Interface Circuit, Li+
+Battery Charger and SFOUT LDO output for powering USB devices. It is
+interfaced to host controller using I2C.
+
+MAX77836 additionally contains PMIC (with two LDO regulators) and Fuel Gauge.
+
+
+Required properties:
+- compatible : Must be "maxim,max14577" or "maxim,max77836".
+- reg : I2C slave address for the max14577 chip (0x25 for max14577/max77836)
+- interrupts : IRQ line for the chip.
+- interrupt-parent :  The parent interrupt controller.
+
+
+Required nodes:
+ - charger :
+   Node for configuring the charger driver.
+   Required properties:
+   - compatible : "maxim,max14577-charger"
+   or "maxim,max77836-charger"
+   - maxim,fast-charge-uamp : Current in uA for Fast Charge;
+   Valid values:
+   - for max14577: 9 - 95;
+   - for max77836: 45000 - 475000;
+   - maxim,eoc-uamp : Current in uA for End-Of-Charge mode;
+   Valid values:
+   - for max14577: 5 - 20;
+   - for max77836: 5000 - 10;
+   - maxim,ovp-uvolt : OverVoltage Protection Threshold in uV;
+   In an overvoltage condition, INT asserts and charging
+   stops. Valid values:
+   - 600, 650, 700, 750;
+   - maxim,constant-uvolt : Battery Constant Voltage in uV;
+   Valid values:
+   - 400 - 428 (step by 2);
+   - 435;
+
+
+Optional nodes:
+- max14577-muic/max77836-muic :
+   Node used only by extcon consumers.
+   Required properties:
+   - compatible : "maxim,max14577-muic" or "maxim,max77836-muic"
+
+- regulators :
+   Required properties:
+   - compatible : "maxim,max14577-regulator"
+   or "maxim,max77836-regulator"
+
+   May contain a sub-node per regulator from the list below. Each
+   sub-node should contain the constraints and initialization information
+   for that regulator. See regulator.txt for a description of standard
+   properties for these sub-nodes.
+
+   List of valid regulator names:
+   - for max14577: CHARGER, SAFEOUT.
+   - for max77836: CHARGER, SAFEOUT, LDO1, LDO2.
+
+   The SAFEOUT is a fixed voltage regulator so there is no need to specify
+   voltages for it.
+
+
+Example:
+
+#include 
+
+max14577@25 {
+   compatible = "maxim,max14577";
+   reg = <0x25>;
+   interrupt-parent = <&gpx1>;
+   interrupts = <5 IRQ_TYPE_NONE>;
+
+   muic: max14577-muic {
+   compatible = "maxim,max14577-muic";
+   };
+
+   regulators {
+   compatible = "maxim,max14577-regulator";
+
+   SAFEOUT {
+   regulator-name = "SAFEOUT";
+   };
+   CHARGER {
+   regulator-name = "CHARGER";
+   regulator-min-microamp = <9>;
+   regulator-max-microamp = <95>;
+   regulator-boot-on;
+   };
+   };
+
+   charger {
+   compatible = "maxim,max14577-charger";
+
+   maxim,constant-uvolt = <435>;
+   maxim,fast-charge-uamp = <45>;
+   maxim,eoc-uamp = <5>;
+   maxim,ovp-uvolt = <650>;
+   };
+};
+
+
+max77836@25 {
+   compatible = "maxim,max77836";
+   reg = <0x25>;
+   interrupt-parent = <&gpx1>;
+   interrupts = <5 IRQ_TYPE_NONE>;
+
+   muic: max77836-muic {
+   compatible = "maxim,max77836-muic";
+   };
+
+   regulators {
+   compatible = "maxim,max77836-regulator";
+
+   SAFEOUT {
+   regulator-name = "SAFEOUT";
+   };
+   CHARGER {
+   regulator-name = "CHARGER";
+   regulator-min-microamp = <9>;
+

[PATCH RESEND] spin_lock_*(): Always evaluate second argument

2014-07-22 Thread Bart Van Assche

Evaluating a macro argument only if certain configuration options
have been selected is confusing and error-prone. Hence always
evaluate the second argument of spin_lock_nested() and
spin_lock_nest_lock().

An intentional side effect of this patch is that the following warning
is no longer reported for netif_addr_lock_nested() when building with
CONFIG_DEBUG_LOCK_ALLOC=n and W=1:

include/linux/netdevice.h: In function 'netif_addr_lock_nested':
include/linux/netdevice.h:2865:6: warning: variable 'subclass' set but not used [-Wunused-but-set-variable]
  int subclass = SINGLE_DEPTH_NESTING;
  ^

Signed-off-by: Bart Van Assche 
Acked-by: David Rientjes 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: David S. Miller 
Cc: Andrew Morton 
---
 include/linux/spinlock.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 3f2867f..32b16cc 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -197,8 +197,10 @@ static inline void do_raw_spin_unlock(raw_spinlock_t *lock) __releases(lock)
 _raw_spin_lock_nest_lock(lock, &(nest_lock)->dep_map); \
 } while (0)
 #else
-# define raw_spin_lock_nested(lock, subclass)  _raw_spin_lock(lock)
-# define raw_spin_lock_nest_lock(lock, nest_lock)  _raw_spin_lock(lock)
+# define raw_spin_lock_nested(lock, subclass)  \
+   ((void)(subclass), _raw_spin_lock(lock))
+# define raw_spin_lock_nest_lock(lock, nest_lock)  \
+   ((void)(nest_lock), _raw_spin_lock(lock))
 #endif
 
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
-- 
1.8.4.5
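
The effect of the comma-expression form can be reproduced in a small
user-space sketch. Nothing here is kernel code: fake_raw_spin_lock(),
lock_nested_old(), lock_nested_new() and next_subclass() are hypothetical
stand-ins that only count calls, so the difference between the two macro
styles becomes observable.

```c
/* Hypothetical stand-in for _raw_spin_lock(): counts calls, locks nothing. */
static int lock_calls;

static void fake_raw_spin_lock(int *lock)
{
	(void)lock;
	lock_calls++;
}

/* Old style: the second argument vanishes from the expansion. */
#define lock_nested_old(lock, subclass)	fake_raw_spin_lock(lock)

/* New style: the second argument is evaluated, then discarded. */
#define lock_nested_new(lock, subclass)	\
	((void)(subclass), fake_raw_spin_lock(lock))

/* Counts how often an argument expression is actually evaluated. */
static int subclass_evals;

static int next_subclass(void)
{
	return ++subclass_evals;
}

/* Reports how many times each macro variant evaluated its second argument. */
static void demo(int *old_evals, int *new_evals)
{
	int lock = 0;

	subclass_evals = 0;
	lock_nested_old(&lock, next_subclass());
	*old_evals = subclass_evals;

	subclass_evals = 0;
	lock_nested_new(&lock, next_subclass());
	*new_evals = subclass_evals;
}
```

Because the new form references its second argument, a local variable passed
there counts as used, which is what silences -Wunused-but-set-variable in the
netif_addr_lock_nested() case.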


[RESEND PATCH v3 0/6] charger/mfd: max14577: Add support for MAX77836

2014-07-22 Thread Krzysztof Kozlowski

Hi,


This is a resend of the third version of the patches adding support for
the MAX77836 device to the max14577 drivers.

This patchset was already reviewed by some of the maintainers during
previous submissions.
I only need acks from power tree (patches: 1, 3, 4, 6).

The patches 1, 2 and 3 depend on each other so they should be
pulled at once. Patches 4, 5 and 6 can be applied independently.
Lee Jones said he can take the set through his tree. Still I need acks
from power subsystem.


Changes since v2

1. charger: Use sysfs instead of DTS for setting the fast charge timer.
   The charger driver now selects the CONFIG_SYSFS and exports
   a DEVICE_ATTR. (suggested by Mark Rutland)
2. Add patch 6 with documentation of exported sysfs entry for fast
   charge timer.
3. charger 3/6: Add missing 'break' in switch parsing valid values
   for fast charge timer.

Changes since v1

1. charger 3/5: Add an error message for each unsuccessful parse of DT
   property (suggested by Mark Rutland).
2. charger 3/5: Use 'u32' type for storing values from DT (suggested
   by Mark Rutland).
3. charger 3/5: Remove an error message for memory allocation failure.


The patchset (first and second part of the MAX77836 drivers) has been
on the lists since January. Changelog for the first part of drivers:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg628696.html


Best regards,
Krzysztof Kozlowski


Krzysztof Kozlowski (6):
  charger: max14577: Add support for MAX77836 charger
  regulator/mfd: max14577: Export symbols for calculating charger
current
  charger: max14577: Configure battery-dependent settings from DTS and
sysfs
  power: max17040: Add ID for MAX77836 Fuel Gauge block
  devicetree: mfd: max14577: Add device tree bindings document
  Documentation: charger: max14577: Document exported sysfs entry

 Documentation/ABI/testing/sysfs-class-power|  14 +
 Documentation/devicetree/bindings/mfd/max14577.txt | 146 
 drivers/mfd/max14577.c | 100 +-
 drivers/power/Kconfig  |   5 +-
 drivers/power/max14577_charger.c   | 370 +++--
 drivers/power/max17040_battery.c   |   1 +
 drivers/regulator/max14577.c   |  80 +
 include/linux/mfd/max14577-private.h   |  95 --
 include/linux/mfd/max14577.h   |  30 ++
 9 files changed, 703 insertions(+), 138 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/mfd/max14577.txt

-- 
1.9.1


Re: [PATCH 3/5] net/netfilter/ipvs/ip_vs_ctl.c: drop argument range check just before the check for equality

2014-07-22 Thread Dan Carpenter

On Mon, Jul 21, 2014 at 11:01:56PM +0300, Julian Anastasov wrote:
> @@ -2333,13 +2339,12 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len)
>   struct ip_vs_dest_user_kern udest;
>   struct netns_ipvs *ipvs = net_ipvs(net);
>  
> + BUILD_BUG_ON(sizeof(arg) > 256);

256 is off-by-one because u8 ranges from 0-255 so we are never able to
copy 256 bytes into the "arg" buffer.

> - if (copylen > 128)
> + if (*len < (int) copylen || *len < 0) {
> + pr_err("get_ctl: len %d < %u\n", *len, copylen);

Don't let users flood dmesg.  Just return an error.  (This can be
triggered by non-root as well).

>   return -EINVAL;
> + }

regards,
dan carpenter
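
The off-by-one remark can be illustrated with a short user-space sketch. It
assumes 8-bit chars and uses a simplified stand-in for BUILD_BUG_ON();
MAX_ARG_LEN, as_u8_len() and arg_buffer_size() are hypothetical names, not
taken from ip_vs_ctl.c.

```c
#include <limits.h>

/*
 * Simplified stand-in for the kernel's BUILD_BUG_ON(): a true condition
 * produces a negative array size, so the build fails.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical buffer size: the largest length a u8 can describe. */
#define MAX_ARG_LEN UCHAR_MAX

/* Returns what ends up in a u8 length field for a requested size. */
static unsigned char as_u8_len(unsigned int requested)
{
	return (unsigned char)requested;
}

/* Checks at build time that the buffer never exceeds a u8-sized length. */
static unsigned int arg_buffer_size(void)
{
	unsigned char arg[MAX_ARG_LEN];

	BUILD_BUG_ON(sizeof(arg) > UCHAR_MAX);
	return (unsigned int)sizeof(arg);
}
```

A length of 256 stored in a u8 wraps to 0, so a bound of 255 is the tightest
check that still matches what such a field can express.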

Re: [RFC PATCH 0/5] futex: introduce an optimistic spinning futex

2014-07-22 Thread Peter Zijlstra

On Mon, Jul 21, 2014 at 09:34:57PM -0400, Steven Rostedt wrote:

> I just want to point out that I was having a very nice conversation
> with Robert Haas (Cc'd) in Napa Valley at Linux Collaboration about
> this very topic. Robert is a PostgreSQL developer who told me that they
> implement their spin locks completely in userspace (no futex, just raw
> spinning on shared memory). This is because the sleep on contention of a
> futex has shown to be very expensive in their benchmarks. His work is
> not a micro benchmark but for a very popular database where locking is
> crucial.

Userspace spinlocks are a clusterfuck. It's impossible to solve the
priority inversion trainwrecks they cause _ever_.

We've had -- as I think Mike already pointed out -- tons of 'fun' with
psql exactly because it's doing this :-(

> I was telling Robert that if futexes get optimistic spinning, he should
> reconsider their use of userspace spinlocks in favor of this, because
> I'm pretty sure that they will see a great improvement.
> 
> Now Robert will be the best one to answer if the system call is indeed
> more expensive than doing full spins in userspace. If the spin is done
> in the kernel and they still get better performance by just spinning
> blindly in userspace even if the owner is asleep, I think we will have
> our answer.

No, the best way is to measure the exact syscall cost. If he still gets
better performance we need to analyze why, there might be something else
hiding there.

> Note, I believe they only care about shared threads, and this
> optimistic spinning does not need to be something done between
> processes.

There's no reason not to provide it for shared futexes, in fact I
suspect not doing it for shared futexes is going to make the code
uglier.

Anyway, there is one big fail in the entire futex stack that we 'need'
to sort some day and that is NUMA. Some people (again database people)
explicitly do not use futexes and instead use sysvsem because of this.

The problem with numa futexes is that because they're vaddr based there
is no (persistent) node information. You always end up having to fall
back to looking in all nodes before you can guarantee there is no
matching futex.

One way to achieve it is by extending the futex value to include a node
number, but that's obviously a complete ABI break. Then again, it should
be pretty straight fwd, since the node number doesn't need to be part of
the actual atomic update part, just part of the userspace storage.

Re: [PATCH 4/5] kernel/debug/kdb/kdb_bp.c: drop negativity check on unsigned value

2014-07-22 Thread Dan Carpenter

On Fri, Jul 18, 2014 at 06:34:52PM +0300, Andrey Utkin wrote:
>  static char *kdb_bptype(kdb_bp_t *bp)
>  {
> - if (bp->bp_type < 0 || bp->bp_type > 4)
> + if (bp->bp_type > 4)
>   return "";

With Smatch, I ignore negative checks in this format.  It's obvious what
the intent is and they are harmless.  Patching them requires a little
review to make sure that someone isn't introducing a bug and can't be
done directly in the email client.

On the other hand, in Smatch I do complain about checks like:

if (bp->bp_type > 4 || bp->bp_type < 0)

Because only backwards thinking people write checks like that.

regards,
dan carpenter


Re: [PATCH 9/9] usb: musb: musb_am335x: reinstate module loading/unloading support

2014-07-22 Thread Lothar Waßmann

Hi,

Felipe Balbi wrote:
> On Fri, Jul 18, 2014 at 11:31:30AM +0200, Lothar Waßmann wrote:
> > There is no need to throw the baby out with the bath due to a bad
> > failure analysis. The commit:
> > 7adb5c876e9c usb: musb: Fix panic upon musb_am335x module removal
> > came to a wrong conclusion about the cause of the crash it was
> > "fixing". The real culprit was the phy-am335x module that was removed
> > from underneath its users that were still referencing data from it.
> > After having fixed this in a previous patch, module unloading can be
> > reinstated.
> > 
> > Another bug with module loading/unloading was the fact, that after
> > removing the devices instantiated from DT their 'OF_POPULATED' flag
> > was still set, so that re-loading the module after an unload had no
> > effect. This is also fixed in this patch.
> 
> now this is a good commit log. I still can't see the need for the other
> patch adding try_module_get(), though. Another thing, this needs to be
> reviewed by DT folks too.
> 
Without holding a reference to the phy module, the module may be
unloaded when its resources are still in use which may lead to the
crash observed in the above stated commit.

> > Signed-off-by: Lothar Waßmann 
> > ---
> >  drivers/usb/musb/musb_am335x.c |   23 ++-
> >  1 file changed, 18 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/usb/musb/musb_am335x.c b/drivers/usb/musb/musb_am335x.c
> > index 164c868..152a6f5 100644
> > --- a/drivers/usb/musb/musb_am335x.c
> > +++ b/drivers/usb/musb/musb_am335x.c
> > @@ -19,6 +19,22 @@ err:
> > return ret;
> >  }
> >  
> > +static int of_remove_populated_child(struct device *dev, void *d)
> > +{
> > +   struct platform_device *pdev = to_platform_device(dev);
> > +
> > +   of_device_unregister(pdev);
> > +   of_node_clear_flag(pdev->dev.of_node, OF_POPULATED);
> 
> I don't think this should be called by drivers; rather by the bus.
> 
Possibly the right thing to do would be to use of_platform_depopulate()
in the remove function to pair up with of_platform_populate() in the
probe function, but doing this results in a crash in
platform_device_del() (called from platform_device_unregister()) when
release_resource() is being called on resources that were never
properly registered with the device:
Unable to handle kernel NULL pointer dereference at virtual address 0018
pgd = 8dd64000
[0018] *pgd=8ddc2831, *pte=, *ppte=
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in: musb_am335x(-) [last unloaded: snd]
CPU: 0 PID: 1435 Comm: modprobe Not tainted 3.16.0-rc5-next-20140717-dbg+ #13
task: 8da00880 ti: 8dda task.ti: 8dda
PC is at release_resource+0x14/0x7c
LR is at release_resource+0x10/0x7c
pc : [<8003165c>]lr : [<80031658>]psr: a013
sp : 8dda1ec0  ip : 8dda  fp : 
r10:   r9 : 8dda  r8 : 8deb7c10
r7 : 8deb7c00  r6 : 0200  r5 : 0001  r4 : 8deed380
r3 :   r2 :   r1 : 0011  r0 : 806772a0
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 8dd64019  DAC: 0015
Process modprobe (pid: 1435, stack limit = 0x8dda0238)
Stack: (0x8dda1ec0 to 0x8dda2000)
1ec0: 8da00880 8deed380 0001 802f850c  8deed380 44e10620 44e1062f
1ee0: 8deb7c00  80398c5c 0081 8000e904 802f8848 8deb7c10 80398d0c
1f00:  802f3470 8d8d7200 8ddc84b0 8d92e810 7f0f9014 8d92e844 7f0f7010
1f20: 8d92e810 802f8110 802f80f8 802f68f8 7f0f9014 8d92e810 7f0f9014 802f7238
1f40: 7f0f9014  2013 802f6764 7f0f9058 8007ec94  6273756d
1f60: 336d615f 00783533 8da00880 8000e764 0001 80053ae4 0058 76f41000
1f80: 7eb299e8 01290838 0011 0081 6010 e854 7eb299e8 01290838
1fa0: 0011 8000e740 7eb299e8 01290838 012908a0 0080  0001
1fc0: 7eb299e8 01290838 0011 0081 012908a0  01290844 
1fe0: 76eb76f0 7eb2995c a534 76eb76fc 6010 012908a0  
[<8003165c>] (release_resource) from [<802f850c>] (platform_device_del+0xb4/0xf4)
[<802f850c>] (platform_device_del) from [<802f8848>] (platform_device_unregister+0xc/0x18)
[<802f8848>] (platform_device_unregister) from [<80398d0c>] (of_platform_device_destroy+0xb0/0xc8)
[<80398d0c>] (of_platform_device_destroy) from [<802f3470>] (device_for_each_child+0x34/0x74)
[<802f3470>] (device_for_each_child) from [<7f0f7010>] (am335x_child_remove+0x10/0x24 [musb_am335x])
[<7f0f7010>] (am335x_child_remove [musb_am335x]) from [<802f8110>] (platform_drv_remove+0x18/0x1c)
[<802f8110>] (platform_drv_remove) from [<802f68f8>] (__device_release_driver+0x70/0xc4)
[<802f68f8>] (__device_release_driver) from [<802f7238>] (driver_detach+0xb4/0xb8)
[<802f7238>] (driver_detach) from [<802f6764>] (bus_remove_driver+0x5c/0xa4)
[<802f6764>] (bus_remove_driver) from [<8007ec94>] (SyS_delete_module+0x120/0x18c)
[<8007ec94>] (SyS_delete_module) from [<8000e740>] (ret_fast_syscall+0x0/0x48)
Code: e1a04000 e59f0068

Re: [PATCH] sched: fix llc shared map unreleased during cpu hotplug

2014-07-22 Thread Peter Zijlstra

On Tue, Jul 22, 2014 at 03:16:31PM +0800, Wanpeng Li wrote:
> [  220.262093] BUG: unable to handle kernel NULL pointer dereference at 0004
> [  220.262104] IP: [] find_busiest_group+0x2b9/0xa30
> [  220.262111] PGD 5a9d5067 PUD 13067 PMD 0
> [  220.262117] Oops:  [#3] SMP
> [...]
> [  220.262245] Call Trace:
> [  220.262252]  [] load_balance+0x156/0x980
> [  220.262259]  [] ? _raw_spin_unlock_irqrestore+0x2e/0xa0
> [  220.262266]  [] idle_balance+0xe3/0x150
> [  220.262270]  [] __schedule+0x797/0x8d0
> [  220.262277]  [] schedule+0x24/0x70
> [  220.262283]  [] schedule_timeout+0x119/0x1f0
> [  220.262294]  [] ? lock_timer_base+0x70/0x70
> [  220.262301]  [] schedule_timeout_uninterruptible+0x19/0x20
> [  220.262308]  [] msleep+0x18/0x20
> [  220.262317]  [] lock_device_hotplug_sysfs+0x2a/0x50
> [  220.262323]  [] online_store+0x2e/0x80
> [  220.262358]  [] dev_attr_store+0x1b/0x20
> [  220.262366]  [] sysfs_write_file+0xdd/0x160
> [  220.262377]  [] vfs_write+0xc8/0x170
> [  220.262384]  [] SyS_write+0x5a/0xa0
> [  220.262388]  [] system_call_fastpath+0x16/0x1b
> 
> The last level cache (llc) shared map is built during cpu up, and the
> sched domain build routine uses it to set up the sched domain cpu
> topology. However, the llc shared map is not released during cpu
> disable, which leads to an invalid sched domain cpu topology. This patch
> fixes it by releasing the llc shared map correctly during cpu disable.
> 
> Signed-off-by: Wanpeng Li 
> ---
>  arch/x86/kernel/smpboot.c | 3 +++
>  1 file changed, 3 insertions(+)

While the scheduler uses this information, the code you're patching is
very much not scheduler code, therefore your subject line is entirely
wrong.

Re: [PATCH 1/2] perf tools: Fix incorrect fd error comparison

2014-07-22 Thread Jiri Olsa

On Thu, Jul 17, 2014 at 11:43:09AM +0300, Adrian Hunter wrote:
> Zero is a valid fd.  Error comparison should check
> for negative fd.
> 
> Signed-off-by: Adrian Hunter 

Acked-by: Jiri Olsa 

thanks,
jirka

Re: [PATCH] kernel/debug/kdb/kdb_bp.c: fix argument range check

2014-07-22 Thread Dan Carpenter

On Fri, Jul 18, 2014 at 08:40:24PM +0300, Andrey Utkin wrote:
> Dropped negativity check; enhanced upper limit check as proposed by
> Walter Harms 
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=80591
> Reported-by: David Binderman 
> Signed-off-by: Andrey Utkin 
> ---
>  kernel/debug/kdb/kdb_bp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/debug/kdb/kdb_bp.c b/kernel/debug/kdb/kdb_bp.c
> index 70a5046..371150f 100644
> --- a/kernel/debug/kdb/kdb_bp.c
> +++ b/kernel/debug/kdb/kdb_bp.c
> @@ -39,7 +39,7 @@ static char *kdb_rwtypes[] = {
>  
>  static char *kdb_bptype(kdb_bp_t *bp)
>  {
> - if (bp->bp_type < 0 || bp->bp_type > 4)
> + if (bp->bp_type >= ARRAY_SIZE(kdb_rwtypes))
>   return "";

I like this version of the patch because it silences the static checker
warning like the first one does but it also makes the code more
readable.

regards,
dan carpenter
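
The bound check discussed above can be sketched in user space as follows.
ARRAY_SIZE() is redefined locally in its common sizeof form, and the table
contents and the bptype_name() helper are placeholders, since the real
kdb_rwtypes[] entries are not shown in this thread.

```c
#include <stddef.h>

/* Local copy of the usual element-count macro. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder table; only its length matters for the check. */
static const char *const rwtypes[] = {
	"type0", "type1", "type2", "type3", "type4",
};

/*
 * Looks up a name for an unsigned type index. The upper bound follows the
 * table size, so adding or removing entries keeps the check correct.
 */
static const char *bptype_name(unsigned int type)
{
	if (type >= ARRAY_SIZE(rwtypes))
		return "";

	return rwtypes[type];
}
```

Because the index is unsigned, a single comparison against the table size
covers every out-of-range value, including what would have been negative
numbers in a signed type.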


Re: [PATCH V3] perf tools: Record whether a dso has data

2014-07-22 Thread Jiri Olsa

On Thu, Jul 17, 2014 at 11:58:30AM +0300, Adrian Hunter wrote:
> Add 'data_status' to record whether a dso has data
> (i.e. an object file)

I might have seen it in your last patchset, but forgot.. what is
this data_status going to be used for?

SNIP

> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index c239e86..07d0a58 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -40,6 +40,12 @@ enum dso_swap_type {
>   DSO_SWAP__YES,
>  };
>  
> +enum dso_data_status {
> + DSO_DATA_STATUS_ERROR   = -1,
> + DSO_DATA_STATUS_UNKNOWN = 0,
> + DSO_DATA_STATUS_OK  = 1,
> +};
> +
>  #define DSO__SWAP(dso, type, val)\
>  ({   \
>   type r = val;   \
> @@ -104,6 +110,7 @@ struct dso {
>   struct {
>   struct rb_root   cache;
>   int  fd;
> + int  data_status;

Also, please call it just 'status'; it's already in the 'data' struct.

thanks,
jirka

Re: [PATCH] x86, TSC: Add a software TSC offset

2014-07-22 Thread Peter Zijlstra

On Mon, Jul 21, 2014 at 02:56:49PM -0700, Andy Lutomirski wrote:
> > Remember, this is only attempting to be a hardware workaround for a
> > smallish number of systems out there. Most of current machines should
> > have stable and synched TSCs.
> 
> I actually own one of these systems.  It's a Sandy Bridge Core-i7
> Extreme or something like that.

If it is (and it sounds like it is) a single socket, then your unsynced
TSC is likely due to SMM fuckery and the TSCs will drift further and
further apart as (run)time increases due to SMM activity.

The problem Borislav is talking about is multi socket systems, which due
to (failed) board layout get the CPUs powered up 'wrong' and the TSCs
between sockets are offset because of this; it's a fixed offset and stable
forever after (until power cycle etc..).

I have one WSM-EP that does this (occasionally).

His initial idea was to re-write the TSC value to match, but since
writing the TSC is expensive (in the 1000s of cycles range) getting an
offset adjustment of 10s of cycles in just right is nigh on impossible.

Re: [PATCH 1/4] mfd: imanager2: Add defines support for IT8516/18/28

2014-07-22 Thread Lee Jones

This patch needs a commit log.  In fact, it can be squashed into the
first commit which uses these defines.

> Signed-off-by: Wei-Chun Pan 
> ---
>  include/linux/mfd/imanager2_ec.h | 358 +++
>  1 file changed, 358 insertions(+)
>  create mode 100644 include/linux/mfd/imanager2_ec.h
> 
> diff --git a/include/linux/mfd/imanager2_ec.h b/include/linux/mfd/imanager2_ec.h
> new file mode 100644
> index 000..bf7d70e
> --- /dev/null
> +++ b/include/linux/mfd/imanager2_ec.h
> @@ -0,0 +1,358 @@
> +/*
> + * imanager2_ec.h - MFD driver defines of Advantech EC IT8516/18/28
> + * Copyright (C) 2014  Richard Vidal-Dorsch 
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .

This is the long version of the licence.  Can you find the shorter one?

> + */
> +
> +#ifndef __IMANAGER2_EC_H__
> +#define __IMANAGER2_EC_H__
> +
> +#include 
> +
> +#define EC_FLAG_IO   0
> +#define EC_FLAG_IO_MAILBOX   (1 << 0)
> +#define EC_FLAG_MAILBOX  (1 << 1)

Use BIT(0), BIT(1) instead.
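
For instance, the three flag defines could look like the sketch below.
BIT() is defined locally here as a stand-in so the snippet builds outside
the kernel tree, and ec_flag_uses_mailbox() is a hypothetical helper added
only to show the flags in use.

```c
/* Local stand-in for the kernel's BIT() helper. */
#define BIT(nr)                 (1UL << (nr))

#define EC_FLAG_IO              0
#define EC_FLAG_IO_MAILBOX      BIT(0)
#define EC_FLAG_MAILBOX         BIT(1)

/* Hypothetical helper: nonzero when @flag selects a mailbox mode. */
static int ec_flag_uses_mailbox(unsigned long flag)
{
    return (flag & (EC_FLAG_IO_MAILBOX | EC_FLAG_MAILBOX)) != 0;
}
```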

> +#define EC_MAX_DEVICE_ID_NUM 0xFF
> +#define EC_MAX_ITEM_NUM  32
> +
> +struct ec_table {
> + u8 devid2itemnum[EC_MAX_DEVICE_ID_NUM];
> + u8 pinnum[EC_MAX_ITEM_NUM];
> + u8 devid[EC_MAX_ITEM_NUM];
> +};
> +
> +struct ec_version {
> + u16 kernel_ver,
> + chip_code,
> + prj_id,
> + prj_ver;

Declare these separately.

> +};
> +
> +#define EC_MAX_LEN_PROJECT_NAME  8

Find a way to do this dynamically, or use 'const char *' instead.

> +struct imanager2 {
> + u16 id;
> + u32 flag;
> + struct mutex lock;  /* protects io */

  We know what locks do.

> + char prj_name[EC_MAX_LEN_PROJECT_NAME + 1]; /* strlen + '\0' */

  We know how strings work. 

> + struct ec_version version;
> + struct ec_table table;

'table' is awfully generic.

> +};

Use KernelDoc to document the main container.

> +/*
> + * Definition
> + */

This comment is not adding anything to the code here.

> +#define EC_TABLE_ITEM_UNUSED 0xFF
> +#define EC_TABLE_DID_NODEV   0x00
> +#define EC_TABLE_HWP_NODEV   0xFF
> +#define EC_TABLE_NOITEM  0xFF
> +
> +#define EC_ERROR 0xFF
> +
> +#define EC_RAM_BANK_SIZE 32  /* 32 bytes size for each bank. */
> +#define EC_RAM_BUFFER_SIZE   256 /* 32 bytes * 8 banks = 256 bytes */
> +
> +#define EC_SIO_CMD   0x29C
> +#define EC_SIO_DATA  0x29D

12bit commands?  Or are these 16bit and should have a leading 0?

> +/* Access Mailbox */
> +#define EC_IO_PORT_CMD   0x29A
> +#define EC_IO_PORT_DATA  0x299
> +
> +#define EC_IO_CMD_READ_OFFSET0xA0
> +#define EC_IO_CMD_WRITE_OFFSET   0x50

Should these have leading 0's too?

> +#define EC_ITE_PORT_OFS  0x29E
> +#define EC_ITE_PORT_DATA 0x29F
> +
> +/*
> + * CMD - IO
> + */

Drop this.

> +/* ADC */
> +#define EC_CMD_ADC_INDEX 0x15
> +#define EC_CMD_ADC_READ_LSB  0x16
> +#define EC_CMD_ADC_READ_MSB  0x1F
> +/* HW Control Table */
> +#define EC_CMD_HWCTRLTABLE_INDEX 0x20
> +#define EC_CMD_HWCTRLTABLE_GET_PIN_NUM   0x21
> +#define EC_CMD_HWCTRLTABLE_GET_DEVICE_ID 0x22
> +#define EC_CMD_HWCTRLTABLE_GET_PIN_ACTIVE_POLARITY   0x23
> +/* ACPI RAM */
> +#define EC_CMD_ACPIRAM_READ  0x80
> +#define EC_CMD_ACPIRAM_WRITE 0x81
> +/* Extend RAM */
> +#define EC_CMD_EXTRAM_READ   0x86
> +#define EC_CMD_EXTRAM_WRITE  0x87
> +/* HW RAM */
> +#define EC_CMD_HWRAM_READ0x88
> +#define EC_CMD_HWRAM_WRITE   0x89
> +
> +/*
> + * ACPI RAM Address Table
> + */
> +/* n = 1 ~ 2 */

Please accompany with words.  Random math is seldom helpful.

I'm guessing that you mean valid values of 'n' can be 1 or 2, but this
is unclear.

> +#define EC_ACPIRAM_ADDR_TEMPERATURE_BASE(n)  (0x60 + 3 * ((n) - 1))

What is 0x60?  Why 3?  Please use #defines for these numbers.
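
Something along these lines would do. The two new names and the meaning
given to each number are assumptions made for illustration; the real
meaning has to come from the EC documentation.

```c
/* Hypothetical names; what 0x60 and 3 stand for is assumed here. */
#define EC_ACPIRAM_TEMPERATURE_START    0x60    /* assumed: first sensor's block */
#define EC_ACPIRAM_TEMPERATURE_STRIDE   3       /* assumed: bytes per sensor */

/* n is the 1-based sensor index (1 or 2, going by the patch comment). */
#define EC_ACPIRAM_ADDR_TEMPERATURE_BASE(n) \
    (EC_ACPIRAM_TEMPERATURE_START + \
     EC_ACPIRAM_TEMPERATURE_STRIDE * ((n) - 1))
```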

> +#define  EC_ACPIRAM_ADDR_LOCAL_TEMPERATURE(n) \
> + EC_ACPIRAM_ADDR_TEMPERATURE_BASE(n)
> +#define  EC_ACPIRAM_A

Re: [PATCH] x86, TSC: Add a software TSC offset

2014-07-22 Thread Peter Zijlstra

On Mon, Jul 21, 2014 at 03:13:33PM -0700, Andy Lutomirski wrote:
> >> I actually own one of these systems.  It's a Sandy Bridge Core-i7
> >> Extreme or something like that.
> >
> > Ha, cool, so I've got my tester! :-)
> 
> Ha.  Ha ha.  Muahaha.  Because IIRC this box is synced until the first
> time it suspends.

Oh cute, one of those :-) Yes some BIOSes write 'random' crap into the
TSC when doing the S states jig. We have some tsc suspend/resume hooks
to correct the worst of that, but yeah screwy that.

[PATCH v2] x86, hotplug: fix llc shared map unreleased during cpu hotplug

2014-07-22 Thread Wanpeng Li

[  220.262093] BUG: unable to handle kernel NULL pointer dereference at 
0004
[  220.262104] IP: [] find_busiest_group+0x2b9/0xa30
[  220.262111] PGD 5a9d5067 PUD 13067 PMD 0
[  220.262117] Oops:  [#3] SMP
[...]
[  220.262245] Call Trace:
[  220.262252]  [] load_balance+0x156/0x980
[  220.262259]  [] ? _raw_spin_unlock_irqrestore+0x2e/0xa0
[  220.262266]  [] idle_balance+0xe3/0x150
[  220.262270]  [] __schedule+0x797/0x8d0
[  220.262277]  [] schedule+0x24/0x70
[  220.262283]  [] schedule_timeout+0x119/0x1f0
[  220.262294]  [] ? lock_timer_base+0x70/0x70
[  220.262301]  [] schedule_timeout_uninterruptible+0x19/0x20
[  220.262308]  [] msleep+0x18/0x20
[  220.262317]  [] lock_device_hotplug_sysfs+0x2a/0x50
[  220.262323]  [] online_store+0x2e/0x80
[  220.262358]  [] dev_attr_store+0x1b/0x20
[  220.262366]  [] sysfs_write_file+0xdd/0x160
[  220.262377]  [] vfs_write+0xc8/0x170
[  220.262384]  [] SyS_write+0x5a/0xa0
[  220.262388]  [] system_call_fastpath+0x16/0x1b

The last level cache (LLC) shared map is built during cpu up, and the
build sched domain routine relies on it to set up the sched domain cpu
topology. However, the llc shared map is not released during cpu disable,
which leads to an invalid sched domain cpu topology. This patch fixes it
by releasing the llc shared map correctly during cpu disable.

Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * fix subject line

 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5492798..0134ec7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1292,6 +1292,9 @@ static void remove_siblinginfo(int cpu)
 
for_each_cpu(sibling, cpu_sibling_mask(cpu))
cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
+   for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
+   cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
+   cpumask_clear(cpu_llc_shared_mask(cpu));
cpumask_clear(cpu_sibling_mask(cpu));
cpumask_clear(cpu_core_mask(cpu));
c->phys_proc_id = 0;
-- 
1.9.1
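
The effect of the added hunk can be modelled in user space with plain
bitmasks. This is a rough sketch only: NR_CPUS, llc_shared, llc_link() and
llc_remove_cpu() are hypothetical stand-ins for the kernel's cpumask
machinery, not the real API.

```c
#include <stdint.h>

#define NR_CPUS 4    /* hypothetical small system */

/* llc_shared[i] has bit j set when CPU j shares an LLC with CPU i. */
static uint64_t llc_shared[NR_CPUS];

/* Mark CPUs @a and @b as sharing a cache (each also shares with itself). */
static void llc_link(int a, int b)
{
    llc_shared[a] |= (1ULL << a) | (1ULL << b);
    llc_shared[b] |= (1ULL << a) | (1ULL << b);
}

/* Model of the hunk: drop @cpu from every sibling's map, then clear its own. */
static void llc_remove_cpu(int cpu)
{
    for (int sibling = 0; sibling < NR_CPUS; sibling++)
        if (llc_shared[cpu] & (1ULL << sibling))
            llc_shared[sibling] &= ~(1ULL << cpu);
    llc_shared[cpu] = 0;
}
```

Without the first loop, the remaining CPUs would still list the offlined
CPU as a cache sibling, which is the stale topology the changelog
describes.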


Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Oded Gabbay


On 22/07/14 02:05, Jerome Glisse wrote:

On Tue, Jul 22, 2014 at 12:56:13AM +0300, Oded Gabbay wrote:

On 21/07/14 22:28, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 10:23:43PM +0300, Oded Gabbay wrote:

On 21/07/14 21:59, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 09:36:44PM +0300, Oded Gabbay wrote:

On 21/07/14 21:14, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 08:42:58PM +0300, Oded Gabbay wrote:

On 21/07/14 18:54, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 05:12:06PM +0300, Oded Gabbay wrote:

On 21/07/14 16:39, Christian König wrote:

Am 21.07.2014 14:36, schrieb Oded Gabbay:

On 20/07/14 20:46, Jerome Glisse wrote:

On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:

Forgot to cc mailing list on cover letter. Sorry.

As a continuation to the existing discussion, here is a v2 patch series
restructured with a cleaner history and no totally-different-early-versions
of the code.

Instead of 83 patches, there are now a total of 25 patches, where 5 of them
are modifications to radeon driver and 18 of them include only amdkfd code.
There is no code going away or even modified between patches, only added.

The driver was renamed from radeon_kfd to amdkfd and moved to reside under
drm/radeon/amdkfd. This move was done to emphasize the fact that this driver
is an AMD-only driver at this point. Having said that, we do foresee a
generic hsa framework being implemented in the future and in that case, we
will adjust amdkfd to work within that framework.

As the amdkfd driver should support multiple AMD gfx drivers, we want to
keep it as a separate driver from radeon. Therefore, the amdkfd code is
contained in its own folder. The amdkfd folder was put under the radeon
folder because the only AMD gfx driver in the Linux kernel at this point
is the radeon driver. Having said that, we will probably need to move it
(maybe to be directly under drm) after we integrate with additional AMD gfx
drivers.

For people who like to review using git, the v2 patch set is located at:
http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2

Written by Oded Gabbay


So quick comments before I finish going over all patches. There are many
things that need more documentation, especially as right now there is
no userspace I can go look at.

So quick comments on some of your questions but first of all, thanks for the
time you dedicated to review the code.


There are a few show stoppers. The biggest one is gpu memory pinning: this is
a big no, and it would need serious arguments for any hope of convincing me on
that side.

We only do gpu memory pinning for kernel objects. There are no userspace
objects that are pinned on the gpu memory in our driver. If that is the case,
is it still a show stopper ?

The kernel objects are:
- pipelines (4 per device)
- mqd per hiq (only 1 per device)
- mqd per userspace queue. On KV, we support up to 1K queues per process, for
a total of 512K queues. Each mqd is 151 bytes, but the allocation is done with
256-byte alignment. So total *possible* memory is 128MB
- kernel queue (only 1 per device)
- fence address for kernel queue
- runlists for the CP (1 or 2 per device)
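
The 128MB figure for the userspace-queue mqds follows directly from the
numbers in the list above, as this small sketch shows. The macro and
function names are made up for illustration.

```c
#include <stdint.h>

#define MQD_SIZE      151u              /* bytes per mqd, from the list above */
#define MQD_ALIGN     256u              /* allocation alignment, from the list above */
#define MAX_QUEUES    (512u * 1024u)    /* 512K queues in total */

/* Round @size up to the next multiple of @align (a power of two). */
static uint64_t align_up(uint64_t size, uint64_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Worst-case pinned memory if every possible queue has an mqd. */
static uint64_t mqd_worst_case_bytes(void)
{
    return (uint64_t)MAX_QUEUES * align_up(MQD_SIZE, MQD_ALIGN);
}
```

Each 151-byte mqd occupies a 256-byte slot, and 512K slots of 256 bytes
come to 128MB.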


The main questions here are whether pinning down the memory is avoidable, and
whether the memory is pinned down at driver load, by request from userspace, or
by anything else.

As far as I can see only the "mqd per userspace queue" might be a bit
questionable, everything else sounds reasonable.

Christian.


Most of the pin downs are done on device initialization.
The "mqd per userspace" is done per userspace queue creation. However, as I
said, it has an upper limit of 128MB on KV, and considering the 2G local
memory, I think it is OK.
The runlists are also done on userspace queue creation/deletion, but we only
have 1 or 2 runlists per device, so it is not that bad.


2G local memory? You cannot assume anything about the user's configuration:
someone might build an hsa computer with 512M and still expect a functioning
desktop.

First of all, I'm only considering a Kaveri computer, not an "hsa" computer.
Second, I would imagine we can build some protection around it, like
checking total local memory and limiting the number of queues based on some
percentage of that total local memory. So, if someone has only
512M, he will be able to open fewer queues.
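
Such a protection might look roughly like the sketch below. Everything in
it is an assumption made for illustration: the function name, the idea of
passing the percentage as a parameter, and the percentages used in the
usage note are not taken from the driver.

```c
#include <stdint.h>

#define MQD_ALLOC_SIZE    256u              /* bytes pinned per queue */
#define HW_MAX_QUEUES     (512u * 1024u)    /* upper bound quoted for KV */

/*
 * Hypothetical policy: let mqds pin at most @percent of local memory,
 * never exceeding the hardware limit.
 */
static uint32_t max_queues_for(uint64_t local_mem_bytes, unsigned int percent)
{
    uint64_t budget = local_mem_bytes / 100 * percent;
    uint64_t queues = budget / MQD_ALLOC_SIZE;

    return queues > HW_MAX_QUEUES ? HW_MAX_QUEUES : (uint32_t)queues;
}
```

With 2G of local memory and a 10% budget the hardware limit of 512K queues
still applies; with 512M and a 5% budget the cap drops to roughly 100K
queues.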




I need to go look into what all this mqd is for, what it does and what it is
about. But pinning is really bad, and this is an issue with userspace command
scheduling, an issue that AMD obviously failed to take into account in the
design phase.

Maybe, but that is the H/W design nonetheless. We can't very well
change the H/W.


You cannot change the hardware, but that is not an excuse to allow bad design to
sneak into software to work around it. So I would rather penalize bad hardware
design and have command submission in the kernel, until AMD fixes its hardware to
allow proper scheduling by the kernel and proper control by the kernel.

I'm sorry but I do *not* think this is a bad design. S/W scheduling i

Re: [PATCH 8/9] usb: phy: am335x: call usb_gen_phy_init()/usb_gen_phy_shutdown() in am335x_init()/am335x_shutdown()

2014-07-22 Thread Lothar Waßmann

Hi,

Felipe Balbi wrote:
> Hi,
> 
> On Mon, Jul 21, 2014 at 10:03:07AM +0200, Lothar Waßmann wrote:
> > Hi,
> > 
> > > On Fri, Jul 18, 2014 at 11:31:29AM +0200, Lothar Waßmann wrote:
> > > > This patch makes it possible to use the musb driver with HW that
> > > > requires external regulators or clocks.
> > > 
> > > can you provide an example of such HW ? Are you not using the internal
> > > PHYs ?
> > > 
> > The Ka-Ro electronics TX48 module uses the mmc0_clk pin as VBUSEN
> > rather than usb0_drvvbus. This patch makes it possible to use an
> > external regulator to handle the VBUS switch through the 'vcc-supply'
> > property of the underlying generic_phy device.
> 
> OK, I get it now. But why would you not use usb0_drvvbus? You could still
> route usb0_drvvbus to the regulator enable pin and the regulator would
> be enabled for you once correct values are written to the IP's mailbox.
> 
> I suppose this has something to do with layout constraints ?
> 
Yes. The usb0_drvvbus is used for a different purpose.


Lothar Waßmann
-- 
___

Ka-Ro electronics GmbH | Pascalstraße 22 | D - 52076 Aachen
Phone: +49 2408 1402-0 | Fax: +49 2408 1402-10
Geschäftsführer: Matthias Kaussen
Handelsregistereintrag: Amtsgericht Aachen, HRB 4996

www.karo-electronics.de | i...@karo-electronics.de
___

Re: [PATCH 0/2] shmem: fix faulting into a hole while it's punched, take 3

2014-07-22 Thread Hugh Dickins

On Mon, 21 Jul 2014, Sasha Levin wrote:
> On 07/19/2014 07:44 PM, Hugh Dickins wrote:
> >> Otherwise, I've been unable to reproduce the shmem_fallocate hang.
> > Great.  Andrew, I think we can say that it's now safe to send
> > 1/2 shmem: fix faulting into a hole, not taking i_mutex
> > 2/2 shmem: fix splicing from a hole while it's punched
> > on to Linus whenever suits you.
> > 
> > (You have some other patches in the mainline-later section of the
> > mmotm/series file: they're okay too, but not in doubt as these two were.)
> 
> I think we may need to hold off on sending them...
> 
> It seems that this code in shmem_fault():
> 
>   /*
>* shmem_falloc_waitq points into the shmem_fallocate()
>* stack of the hole-punching task: shmem_falloc_waitq
>* is usually invalid by the time we reach here, but
>* finish_wait() does not dereference it in that case;
>* though i_lock needed lest racing with wake_up_all().
>*/
>   spin_lock(&inode->i_lock);
>   finish_wait(shmem_falloc_waitq, &shmem_fault_wait);
>   spin_unlock(&inode->i_lock);
> 
> Is problematic. I'm not sure what changed, but it seems to be causing everything
> from NULL ptr derefs:
> 
> [  169.922536] BUG: unable to handle kernel NULL pointer dereference at 
> 0631
> 
> To memory corruptions:
> 
> [ 1031.264226] BUG: spinlock bad magic on CPU#1, trinity-c99/25740
> 
> And hangs:
> 
> [  212.010020] INFO: rcu_preempt detected stalls on CPUs/tasks:

Ugh.

I'm so tired of this, I'm flailing around here, and could use some help.

But there is one easy change which might do it: please would you try
changing the TASK_KILLABLE a few lines above to TASK_UNINTERRUPTIBLE.

I noticed when deciding on the i_lock'ing there, that a lot of the
difficulty in races between the two ends, came from allowing KILLABLE
at the faulting end.  Which was a nice courtesy, but one I'll gladly
give up on now, if it is responsible for these troubles.

Please give it a try, I don't know what else to suggest.  And I've
no idea why the problem should emerge only now.  If this change
does appear to fix it, please go back and forth with and without,
to gather confidence that the problem is still reproducible without
the fix.  If fix it turns out to be - touch wood.

Thanks,
Hugh

Re: [PATCH 2/7] KVM: x86: Function for determining exception type

2014-07-22 Thread Paolo Bonzini

Il 21/07/2014 23:30, Nadav Amit ha scritto:
> Few comments to see we are on the same page:
> 
> On 7/21/14, 3:18 PM, Paolo Bonzini wrote:
>> Il 21/07/2014 13:37, Nadav Amit ha scritto:
>>> +int kvm_exception_type(unsigned int nr)
>>
>> The manual calls this the exception class.
> Yes, but it also calls it exception "type" (see table 6-1
> "Protected-Mode Exceptions and Interrupts" on the SDM).
> I called it exception type, since there is a function exception_class
> that is used to handle nested exceptions.

Ok.

>>> +case VE_VECTOR:
>>> +return EXCPT_FAULT;
>>> +case DB_VECTOR:
>>> +return EXCPT_FAULT_OR_TRAP;
>>
>> It is only a fault for instruction fetch breakpoints.  You can modify
>> kvm_vcpu_check_breakpoint to set RF, add a comment here that fault
>> handling is done elsewhere, and return EXCPT_TRAP.
> Unless I am mistaken, kvm_vcpu_check_breakpoint checks only for
> instruction breakpoint. Since instruction breakpoint should not cause RF
> to be set, this function should not be changed.

You're right ("For any fault-class exception except a debug exception
generated in response to an instruction breakpoint, the
value pushed for RF is 1").  It's the debug exception handler that has
to set RF (and then iretd/iretq will modify RF).

> Anyhow, I would return EXCPT_TRAP on DB_VECTOR.

Yeah, just add a comment that it can be a fault for instruction fetch
breakpoints, but we treat it as a trap for the purposes of (not) setting RF.

Alternatively, just rename the function to exception_sets_rf and return
true/false.
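
A sketch of what that alternative might look like. The vector numbers are
the architectural ones; the function body is hypothetical, and treating
every vector not listed as fault-class is an assumption made to keep the
example short.

```c
#include <stdbool.h>

/* x86 exception vector numbers used below. */
#define DB_VECTOR    1
#define BP_VECTOR    3
#define OF_VECTOR    4
#define DF_VECTOR    8
#define MC_VECTOR    18

/* Hypothetical: true when delivering exception @nr should push RF=1. */
static bool exception_sets_rf(unsigned int nr)
{
    switch (nr) {
    case DB_VECTOR:    /* trap, or the one fault that does not set RF */
    case BP_VECTOR:    /* trap */
    case OF_VECTOR:    /* trap */
    case DF_VECTOR:    /* abort */
    case MC_VECTOR:    /* abort */
        return false;
    default:
        /* Assumption: every other vector is a fault-class exception. */
        return true;
    }
}
```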

Paolo

Re: ZAC target (Was: Re: dm-multipath: Accept failed paths for multipath maps)

2014-07-22 Thread Matias Bjorling


On 07/22/2014 07:46 AM, Hannes Reinecke wrote:

On 07/21/2014 09:28 PM, Kent Overstreet wrote:

On Mon, Jul 21, 2014 at 04:23:41PM +0200, Hannes Reinecke wrote:

On 07/18/2014 07:04 PM, John Utz wrote:

On 07/18/2014 05:31 AM, John Utz wrote:

Thank you very much for the exhaustive answer! I forwarded it on to my
project peers because I don't think any of us were aware of the
existing infrastructure.

Of course, said infrastructure would have to be taught about ZAC,
but it seems like it would be a nice place to start testing from


ZAC is a different beast altogether; I've posted an initial set of
patches a while back on linux-scsi.
But I don't think multipath needs to be changed for that.
Other areas of device-mapper most certainly do.


Pretty sure John is working on a new ZAC-oriented DM target.

YUP.

Per Ted Ts'o's suggestion several months ago, the goal is to create
a new DM target that implements the ZAC/ZBC command set and the SMR
write pointer architecture so that FSfolksen can try their hand at
porting their stuff to it.

It's in the very early stages so there is nothing to show yet, but
development is ongoing. There are a few unknowns about how to surface
some specific behaviors (new verbs and errors, particularly errors
with sense codes that return a write pointer) but I have not gotten
far enough along in development to be able to construct succinct and
specific questions on the topic, so that will have to wait for a bit.


I was pondering the 'best' ZAC implementation, too, and found the
'report zones' command _very_ cumbersome to use.
Especially the fact that in theory each zone could have a different size
_and_ plenty of zones could be present will be making zone lookup
hellish.

However: it seems to me that we might benefit from a generic
'block boundaries' implementation.
Reasoning here is that several subsystems (RAID, ZAC/ZBC, and things like
referrals) impose I/O scheduling boundaries which must not be crossed when
assembling requests.


Wasn't Ted working on such a thing?


Seeing that we already have some block limitations I was wondering if we
couldn't have some set of 'I/O scheduling boundaries' as part
of the request_queue structure.


I'd prefer not to dump yet more crap in request_queue, but that's a fairly
minor quibble :)

I also tend to think having different size zones is crazy and I would avoid
making any effort to support that in practice, but OTOH there's good reason
for wanting one or two "normal" zones and the rest append only so the
interface is going to have to accommodate some differences between zones.

Also, depending on the approach supporting different size zones might not
actually be problematic. If you're starting with something that's pure COW and
you're just plugging in this "ZAC allocation" stuff (which I think is what I'm
going to do in bcache) then it might not actually be an issue.


No, what I was suggesting is to introduce 'I/O scheduling barriers'.
Some devices like RAID or indeed ZAC have internal boundaries which
cannot be crossed by any I/O. So either the I/O has to be split up or
the I/O scheduler has to be made aware of these boundaries.

I have had this issue several times now (once with implementing
Referrals, now with ZAC) so I was wondering whether we can have some
sort of generic implementation in the block layer.
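
The "split up" option can be sketched for the simple case where all
boundaries are equally spaced. That spacing is an assumption (the thread
notes that ZAC zones may differ in size), and struct io_chunk and
split_at_boundaries() are hypothetical names, not block layer API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct io_chunk {
    uint64_t start;    /* first sector of the chunk */
    uint64_t len;      /* number of sectors */
};

/*
 * Split [start, start + len) so that no chunk crosses a multiple of
 * @boundary sectors. Writes at most @max chunks to @out and returns
 * how many were written.
 */
static size_t split_at_boundaries(uint64_t start, uint64_t len,
                                  uint64_t boundary,
                                  struct io_chunk *out, size_t max)
{
    size_t n = 0;

    assert(boundary > 0);
    while (len > 0 && n < max) {
        /* Sectors left before the next boundary. */
        uint64_t room = boundary - (start % boundary);
        uint64_t take = len < room ? len : room;

        out[n].start = start;
        out[n].len = take;
        n++;
        start += take;
        len -= take;
    }
    return n;
}
```

A 10-sector I/O starting at sector 6 with a boundary every 8 sectors is
split into a 2-sector chunk at 6 and an 8-sector chunk at 8.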


Would it make any sense to put the whole block allocation strategy within 
the block layer (or just on top of the device driver) and then let 
MDs/FSs users hook into this for allocating new blocks?


That allows multiple implementations to use the same block address space.



And as we're already having request queue limits this might fall quite
naturally into it. Or so I thought.

Hmm. Guess I should start coding here.

Cheers,

Hannes



[PATCH v3 3/9] perf, x86: use the PEBS auto reload mechanism when possible

2014-07-22 Thread Yan, Zheng

When a fixed period is specified, this patch makes perf use the PEBS
auto reload mechanism. This makes normal profiling faster, because
it avoids one costly MSR write in the PMI handler.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event.c  | 15 +--
 arch/x86/kernel/cpu/perf_event.h  |  1 +
 arch/x86/kernel/cpu/perf_event_intel_ds.c |  9 +
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 8868e9b..60593bc 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -979,13 +979,16 @@ int x86_perf_event_set_period(struct perf_event *event)
 
per_cpu(pmc_prev_left[idx], smp_processor_id()) = left;
 
-   /*
-* The hw event starts counting from this event offset,
-* mark it to be able to extra future deltas:
-*/
-   local64_set(&hwc->prev_count, (u64)-left);
+   if (!(hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) ||
+   local64_read(&hwc->prev_count) != (u64)-left) {
+   /*
+* The hw event starts counting from this event offset,
+* mark it to be able to extra future deltas:
+*/
+   local64_set(&hwc->prev_count, (u64)-left);
 
-   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
+   wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
+   }
 
/*
 * Due to erratum on certan cpu we need
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index d8165f3..fa8dfd4 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -69,6 +69,7 @@ struct event_constraint {
 #define PERF_X86_EVENT_PEBS_ST 0x2 /* st data address sampling */
 #define PERF_X86_EVENT_PEBS_ST_HSW 0x4 /* haswell style st data sampling */
 #define PERF_X86_EVENT_COMMITTED   0x8 /* event passed commit_txn */
+#define PERF_X86_EVENT_AUTO_RELOAD 0x10 /* use PEBS auto-reload */
 
 struct amd_nb {
int nb_id;  /* NorthBridge id */
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 980970c..ab91b11 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -714,6 +714,8 @@ void intel_pmu_pebs_enable(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
 
hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
+   if (!event->attr.freq)
+   hwc->flags |= PERF_X86_EVENT_AUTO_RELOAD;
 
cpuc->pebs_enabled |= 1ULL << hwc->idx;
 
@@ -721,6 +723,12 @@ void intel_pmu_pebs_enable(struct perf_event *event)
cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled |= 1ULL << 63;
+
+   /* Use auto-reload if possible to save a MSR write in the PMI */
+   if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
+   ds->pebs_event_reset[hwc->idx] =
+   (u64)-hwc->sample_period & x86_pmu.cntval_mask;
+   }
 }
 
 void intel_pmu_pebs_disable(struct perf_event *event)
@@ -739,6 +747,7 @@ void intel_pmu_pebs_disable(struct perf_event *event)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
 
hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
+   hwc->flags &= ~PERF_X86_EVENT_AUTO_RELOAD;
 }
 
 void intel_pmu_pebs_enable_all(void)
-- 
1.9.3
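
The reset value written to ds->pebs_event_reset[] above is the negated
sample period truncated to the counter width, so the counter overflows
again after exactly one period. A user-space sketch of that arithmetic,
assuming a 48-bit counter (the real width comes from x86_pmu.cntval_mask);
both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed 48-bit counter width, for illustration only. */
#define CNTVAL_MASK    ((1ULL << 48) - 1)

/* Value the counter restarts from so that it overflows after @period events. */
static uint64_t pebs_reset_value(uint64_t period)
{
    return (0 - period) & CNTVAL_MASK;
}

/* Events left before a counter holding @value wraps past CNTVAL_MASK. */
static uint64_t events_until_overflow(uint64_t value)
{
    return CNTVAL_MASK - value + 1;
}
```

With auto-reload the hardware restores this value by itself after each
PEBS record, which is why the MSR write in the PMI handler can be skipped.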


[PATCH v2] ipv4: Make IP_MULTICAST_ALL and IP_MSFILTER work on raw sockets

2014-07-22 Thread Quentin Armitage

Currently, although IP_MULTICAST_ALL and IP_MSFILTER ioctl calls succeed on
raw sockets, there is no code to implement the functionality on received
packets; it is only implemented for UDP sockets. The raw(7) man page states:
"In addition, all ip(7) IPPROTO_IP socket options valid for datagram sockets
are supported", which implies these ioctls should work on raw sockets.

To fix this, add a call to ip_mc_sf_allow on raw sockets.

This should not break any existing code, since the current position of
not calling ip_mc_sf_allow makes it behave as if neither the IP_MULTICAST_ALL
nor the IP_MSFILTER ioctl had been called. Adding the call to ip_mc_sf_allow
will therefore maintain the current behaviour so long as IP_MULTICAST_ALL and
IP_MSFILTER ioctls are not called. Any code that currently is calling
IP_MULTICAST_ALL or IP_MSFILTER ioctls on raw sockets presumably is wanting
the filter to be applied, although no filtering will currently be occurring.

Signed-off-by: Quentin Armitage 
---
v2: * Fixed subject line
---
 net/ipv4/raw.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 2c65160..2e1628c 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -174,7 +175,9 @@ static int raw_v4_input(struct sk_buff *skb, const struct 
iphdr *iph, int hash)
 
while (sk) {
delivered = 1;
-   if (iph->protocol != IPPROTO_ICMP || !icmp_filter(sk, skb)) {
+   if ((iph->protocol != IPPROTO_ICMP || !icmp_filter(sk, skb)) &&
+   ip_mc_sf_allow(sk, iph->daddr, iph->saddr,
+  skb->dev->ifindex)) {
struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC);
 
/* Not releasing hash table! */
-- 
1.7.7.6



Re: drbd_worker.c: Remove lines

2014-07-22 Thread Mike Galbraith

On Tue, 2014-07-22 at 02:17 -0400, Nick Krause wrote: 
> There seem to be two FIXMEs in the function wait_for_work. I was
> wondering can we remove these spinlocks?

Rather than ask someone else this question, take a look to see what the
lock protects and from whom.  If protected thingy cannot possibly be
diddled concurrently, you'll know what to do, if it can be, you'll know
what to do.  If you can't figure it out, move on to the next windmill
until the last one on the planet has been tilted or bounced off of :)

-Mike


Re: RE: [PATCH] lib : lz4 using put_unaligned_le16 instead of put_unaligned

2014-07-22 Thread Eunbong Song


Hello. 

>Can you check if this patch fix your problem also?
Unfortunately your patch does not fix my problem. My test logs are as follows.
I did the same test with lzo compression algorithm and with my patch, and 
these work well.

debug_shell:/user> echo 1 > /sys/block/zram0/reset
debug_shell:/user> echo lz4 > /sys/block/zram0/comp_algorithm
debug_shell:/user> echo 520M > /sys/block/zram0/disksize
debug_shell:/user> mkfs.ext4 /dev/zram0
mke2fs 1.41.4 (27-Jan-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
33280 inodes, 133120 blocks
6656 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=138412032
5 block groups
32768 blocks per group, 32768 fragments per group
6656 inodes per group
Superblock backups stored on blocks:
32768, 98304

Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 25 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
debug_shell:/user> mount /dev/zram0 /mnt/
EXT4-fs (zram0): unsupported inode size: 272
mount: wrong fs type, bad option, bad superblock on /dev/zram0,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so



Thanks

[PATCH v3 8/9] perf, x86: enlarge PEBS buffer

2014-07-22 Thread Yan, Zheng

Currently the PEBS buffer size is 4k, so it can only hold about 21
PEBS records. This patch enlarges the PEBS buffer size to 64k
(the same as the BTS buffer); 64k of memory can hold about 330 PEBS
records. This will significantly reduce the number of PMIs when a
large PEBS interrupt threshold is used.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 99b07de0..33b4c0e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -11,7 +11,7 @@
 #define BTS_RECORD_SIZE24
 
 #define BTS_BUFFER_SIZE(PAGE_SIZE << 4)
-#define PEBS_BUFFER_SIZE   PAGE_SIZE
+#define PEBS_BUFFER_SIZE   (PAGE_SIZE << 4)
 #define PEBS_FIXUP_SIZEPAGE_SIZE
 
 /*
-- 
1.9.3


Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Oded Gabbay


On 22/07/14 10:23, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 10:23:43PM +0300, Oded Gabbay wrote:

But Jerome, the core problem still remains in effect, even with your
suggestion. If an application, either via userspace queue or via ioctl,
submits a long-running kernel, then the CPU in general can't stop the
GPU from running it. And if that kernel does while(1); then that's it,
game's over, no matter how you submitted the work. So I don't really
see the big advantage in your proposal. Only in CZ can we stop this wave
(by CP H/W scheduling only). What you are saying is basically I won't
allow people to use compute on a Linux KV system because it _may_ get the
system stuck.

So even if I really wanted to, and I may agree with you theoretically on
that, I can't fulfill your desire to make the "kernel being able to
preempt at any time and be able to decrease or increase user queue
priority so overall kernel is in charge of resources management and it
can handle rogue client in proper fashion". Not in KV, and I guess not
in CZ as well.


At least on intel the execlist stuff which is used for preemption can be
used by both the cpu and the firmware scheduler. So we can actually
preempt when doing cpu scheduling.

It sounds like current amd hw doesn't have any preemption at all. And
without preemption I don't think we should ever consider to allow
userspace to directly submit stuff to the hw and overload. Imo the kernel
_must_ sit in between and reject clients that don't behave. Of course you
can only ever react (worst case with a gpu reset, there's code floating
around for that on intel-gfx), but at least you can do something.

If userspace has a direct submit path to the hw then this gets really
tricky, if not impossible.
-Daniel



Hi Daniel,
See the email I just sent to Jerome regarding preemption. Bottom line, in KV, we 
can preempt running queues, except in the case of a stuck gpu kernel. In CZ, 
this was solved.


So, in this regard, I don't think there is any difference between userspace 
queues and ioctl.


Oded

[PATCH v3 6/9] perf, x86: handle multiple records in PEBS buffer

2014-07-22 Thread Yan, Zheng

When the PEBS interrupt threshold is larger than one, the PEBS buffer
may include multiple records for each PEBS event. This patch makes
the code first count how many records each PEBS event has, then
output the samples in batch.

One corner case worth mentioning is that the PEBS hardware doesn't
deal well with collisions, when PEBS events happen near to each
other. The records for the events can be collapsed into a single
one. However, in practice collisions are extremely rare, as long as
different events are used. The periods are typically very large,
so any collision is unlikely. When a collision happens, we can either
drop the PEBS record or use the record to serve multiple events.
This patch chooses the latter approach.
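
The counting step described above can be sketched as follows. This is a simplified illustration only: the record type, the counter limit and the function name are hypothetical, and the real records carry far more than a status bitmask.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_EVENTS 8  /* hypothetical number of PEBS counters */

/* Hypothetical, simplified record: only the overflow status bitmask. */
struct example_record {
    uint64_t status;
};

/*
 * For each counter index, count how many records have its status bit set.
 * A record with several bits set (a collision) is counted for every
 * event involved, i.e. one record serves multiple events.
 */
static void count_records(const struct example_record *recs, int nrecs,
                          int counts[EXAMPLE_MAX_EVENTS])
{
    for (int bit = 0; bit < EXAMPLE_MAX_EVENTS; bit++)
        counts[bit] = 0;

    for (int i = 0; i < nrecs; i++)
        for (int bit = 0; bit < EXAMPLE_MAX_EVENTS; bit++)
            if (recs[i].status & (UINT64_C(1) << bit))
                counts[bit]++;
}
```

With the per-event counts known up front, the output space for all samples of one event can be reserved once instead of once per record.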

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 88 +--
 1 file changed, 59 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 86ef5b0..1e3b8cf 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -989,18 +989,51 @@ static void setup_pebs_sample_data(struct perf_event 
*event,
 }
 
 static void __intel_pmu_pebs_event(struct perf_event *event,
-  struct pt_regs *iregs, void *__pebs)
+  struct pt_regs *iregs,
+  void *at, void *top, int count)
 {
+   struct perf_output_handle handle;
+   struct perf_event_header header;
struct perf_sample_data data;
struct pt_regs regs;
 
-   if (!intel_pmu_save_and_restart(event))
+   if (!intel_pmu_save_and_restart(event) &&
+   !(event->hw.flags & PERF_X86_EVENT_AUTO_RELOAD))
return;
 
-   setup_pebs_sample_data(event, iregs, __pebs, &data, &regs);
+   setup_pebs_sample_data(event, iregs, at, &data, &regs);
 
-   if (perf_event_overflow(event, &data, &regs))
+   if (perf_event_overflow(event, &data, &regs)) {
x86_pmu_stop(event, 0);
+   return;
+   }
+
+   if (count <= 1)
+   return;
+
+   at += x86_pmu.pebs_record_size;
+   count--;
+
+   perf_sample_data_init(&data, 0, event->hw.last_period);
+   perf_prepare_sample(&header, &data, event, &regs);
+
+   if (perf_output_begin(&handle, event, header.size * count))
+   return;
+
+   for (; at < top; at += x86_pmu.pebs_record_size) {
+   struct pebs_record_nhm *p = at;
+   if (!(p->status & (1 << event->hw.idx)))
+   continue;
+
+   setup_pebs_sample_data(event, iregs, at, &data, &regs);
+   perf_output_sample(&handle, &header, &data, event);
+
+   count--;
+   if (count == 0)
+   break;
+   }
+
+   perf_output_end(&handle);
 }
 
 static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
@@ -1041,61 +1074,58 @@ static void intel_pmu_drain_pebs_core(struct pt_regs 
*iregs)
WARN_ONCE(n > 1, "bad leftover pebs %d\n", n);
at += n - 1;
 
-   __intel_pmu_pebs_event(event, iregs, at);
+   __intel_pmu_pebs_event(event, iregs, at, top, 1);
 }
 
 static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
 {
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
-   struct perf_event *event = NULL;
-   void *at, *top;
-   u64 status = 0;
+   struct perf_event *event;
+   void *base, *at, *top;
int bit;
+   int counts[MAX_PEBS_EVENTS] = {};
 
if (!x86_pmu.pebs_active)
return;
 
-   at  = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
+   base = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
 
ds->pebs_index = ds->pebs_buffer_base;
 
-   if (unlikely(at > top))
+   if (unlikely(base >= top))
return;
 
-   /*
-* Should not happen, we program the threshold at 1 and do not
-* set a reset value.
-*/
-   WARN_ONCE(top - at > x86_pmu.max_pebs_events * x86_pmu.pebs_record_size,
- "Unexpected number of pebs records %ld\n",
- (long)(top - at) / x86_pmu.pebs_record_size);
-
-   for (; at < top; at += x86_pmu.pebs_record_size) {
+   for (at = base; at < top; at += x86_pmu.pebs_record_size) {
struct pebs_record_nhm *p = at;
-
+   /*
+* PEBS creates only one entry if multiple counters
+* overflow simultaneously.
+*/
for_each_set_bit(bit, (unsigned long *)&p->status,
 x86_pmu.max_pebs_events) {
event = cpuc->events[bit];
if (!test_bit(bit, cpuc->active_mask))

[PATCH v3 9/9] tools, perf: Allow the user to disable time stamps

2014-07-22 Thread Yan, Zheng

From: Andi Kleen 

Time stamps are currently always implicitly enabled for record.
The old --time/-T option is a nop.

Allow the user to disable time stamps by using --no-time.

This can cause some minor misaccounting (by missing mmaps), but significantly
lowers the size of perf.data

The defaults are unchanged.
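
The resulting decision can be sketched as a small predicate. This is an illustration of the condition only, not the perf tool's code; the function and parameter names are hypothetical stand-ins for the option fields involved.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether samples should carry a time stamp.
 * If the user disabled time stamps (sample_time == false), nothing else
 * can force them back on.
 */
static bool wants_time_stamp(bool sample_time, bool missing_sample_id_all,
                             bool no_inherit, bool has_cpu_target,
                             bool per_cpu)
{
    return sample_time &&
           !missing_sample_id_all &&
           (!no_inherit || has_cpu_target || per_cpu);
}
```

Because sample_time defaults to true, the default behaviour stays the same; only an explicit --no-time changes the outcome.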

Signed-off-by: Andi Kleen 
---
 tools/perf/builtin-record.c | 1 +
 tools/perf/util/evsel.c | 9 ++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 378b85b..8728c7c 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -776,6 +776,7 @@ static const char * const record_usage[] = {
  */
 static struct record record = {
.opts = {
+   .sample_time = true,
.mmap_pages  = UINT_MAX,
.user_freq   = UINT_MAX,
.user_interval   = ULLONG_MAX,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8606175..1bc4093 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -632,9 +632,12 @@ void perf_evsel__config(struct perf_evsel *evsel, struct 
record_opts *opts)
if (opts->period)
perf_evsel__set_sample_bit(evsel, PERIOD);
 
-   if (!perf_missing_features.sample_id_all &&
-   (opts->sample_time || !opts->no_inherit ||
-target__has_cpu(&opts->target) || per_cpu))
+   /*
+* When the user explicitly disabled time don't force it here.
+*/
+   if (opts->sample_time &&
+   (!perf_missing_features.sample_id_all &&
+   (!opts->no_inherit || target__has_cpu(&opts->target) || per_cpu)))
perf_evsel__set_sample_bit(evsel, TIME);
 
if (opts->raw_samples) {
-- 
1.9.3


[PATCH v3 2/9] perf, x86: use context switch callback to flush LBR stack

2014-07-22 Thread Yan, Zheng

The previous commit introduced a context switch callback whose function
overlaps with the flush branch stack callback, so we can use the
context switch callback to flush the LBR stack.

This patch adds code that uses the context switch callback to
flush the LBR stack when a task is being scheduled in. The callback
is enabled only when there are events that use the LBR hardware. This
patch also removes all the old flush branch stack code.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event.c   |  7 ---
 arch/x86/kernel/cpu/perf_event.h   |  3 +-
 arch/x86/kernel/cpu/perf_event_intel.c | 14 +-
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 38 --
 include/linux/perf_event.h |  6 ---
 kernel/events/core.c   | 81 --
 6 files changed, 36 insertions(+), 113 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 7d22972..8868e9b 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1880,12 +1880,6 @@ static void x86_pmu_sched_task(struct perf_event_context 
*ctx, bool sched_in)
x86_pmu.sched_task(ctx, sched_in);
 }
 
-static void x86_pmu_flush_branch_stack(void)
-{
-   if (x86_pmu.flush_branch_stack)
-   x86_pmu.flush_branch_stack();
-}
-
 void perf_check_microcode(void)
 {
if (x86_pmu.check_microcode)
@@ -1912,7 +1906,6 @@ static struct pmu pmu = {
.commit_txn = x86_pmu_commit_txn,
 
.event_idx  = x86_pmu_event_idx,
-   .flush_branch_stack = x86_pmu_flush_branch_stack,
.sched_task = x86_pmu_sched_task,
 };
 
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index e70b352..d8165f3 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -428,7 +428,6 @@ struct x86_pmu {
void(*cpu_dead)(int cpu);
 
void(*check_microcode)(void);
-   void(*flush_branch_stack)(void);
void(*sched_task)(struct perf_event_context *ctx,
  bool sched_in);
 
@@ -685,6 +684,8 @@ void intel_pmu_pebs_disable_all(void);
 
 void intel_ds_init(void);
 
+void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in);
+
 void intel_pmu_lbr_reset(void);
 
 void intel_pmu_lbr_enable(struct perf_event *event);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c 
b/arch/x86/kernel/cpu/perf_event_intel.c
index adb02aa..ef926ee 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2035,18 +2035,6 @@ static void intel_pmu_cpu_dying(int cpu)
fini_debug_store_on_cpu(cpu);
 }
 
-static void intel_pmu_flush_branch_stack(void)
-{
-   /*
-* Intel LBR does not tag entries with the
-* PID of the current task, then we need to
-* flush it on ctxsw
-* For now, we simply reset it
-*/
-   if (x86_pmu.lbr_nr)
-   intel_pmu_lbr_reset();
-}
-
 PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63");
 
 PMU_FORMAT_ATTR(ldlat, "config1:0-15");
@@ -2098,7 +2086,7 @@ static __initconst const struct x86_pmu intel_pmu = {
.cpu_starting   = intel_pmu_cpu_starting,
.cpu_dying  = intel_pmu_cpu_dying,
.guest_get_msrs = intel_guest_get_msrs,
-   .flush_branch_stack = intel_pmu_flush_branch_stack,
+   .sched_task = intel_pmu_lbr_sched_task,
 };
 
 static __init void intel_clovertown_quirk(void)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c 
b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 9dd2459..a30bfab 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -181,13 +181,36 @@ void intel_pmu_lbr_reset(void)
intel_pmu_lbr_reset_64();
 }
 
-void intel_pmu_lbr_enable(struct perf_event *event)
+void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
 {
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
 
if (!x86_pmu.lbr_nr)
return;
+   /*
+* When sampling the branch stack in system-wide, it may be
+* necessary to flush the stack on context switch. This happens
+* when the branch stack does not tag its entries with the pid
+* of the current task. Otherwise it becomes impossible to
+* associate a branch entry with a task. This ambiguity is more
+* likely to appear when the branch stack supports priv level
+* filtering and the user sets it to monitor only at the user
+* level (which could be a useful measurement in system-wide
+* mode). In that case, the risk is high of having a branch
+* stack with branch from multiple tasks.
+*/
+   if (sched_in) {
+   intel_pmu_lbr_reset();
+   cpuc->lbr_context

[PATCH v3 4/9] perf, x86: introduce setup_pebs_sample_data()

2014-07-22 Thread Yan, Zheng

Move the code that sets up PEBS sample data into a separate function.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 63 ++-
 1 file changed, 36 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index ab91b11..858c4ee 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -868,8 +868,10 @@ static inline u64 intel_hsw_transaction(struct 
pebs_record_hsw *pebs)
return txn;
 }
 
-static void __intel_pmu_pebs_event(struct perf_event *event,
-  struct pt_regs *iregs, void *__pebs)
+static void setup_pebs_sample_data(struct perf_event *event,
+  struct pt_regs *iregs, void *__pebs,
+  struct perf_sample_data *data,
+  struct pt_regs *regs)
 {
/*
 * We cast to the biggest pebs_record but are careful not to
@@ -877,21 +879,16 @@ static void __intel_pmu_pebs_event(struct perf_event 
*event,
 */
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct pebs_record_hsw *pebs = __pebs;
-   struct perf_sample_data data;
-   struct pt_regs regs;
u64 sample_type;
int fll, fst;
 
-   if (!intel_pmu_save_and_restart(event))
-   return;
-
fll = event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT;
fst = event->hw.flags & (PERF_X86_EVENT_PEBS_ST |
 PERF_X86_EVENT_PEBS_ST_HSW);
 
-   perf_sample_data_init(&data, 0, event->hw.last_period);
+   perf_sample_data_init(data, 0, event->hw.last_period);
 
-   data.period = event->hw.last_period;
+   data->period = event->hw.last_period;
sample_type = event->attr.sample_type;
 
/*
@@ -902,19 +899,19 @@ static void __intel_pmu_pebs_event(struct perf_event 
*event,
 * Use latency for weight (only avail with PEBS-LL)
 */
if (fll && (sample_type & PERF_SAMPLE_WEIGHT))
-   data.weight = pebs->lat;
+   data->weight = pebs->lat;
 
/*
 * data.data_src encodes the data source
 */
if (sample_type & PERF_SAMPLE_DATA_SRC) {
if (fll)
-   data.data_src.val = 
load_latency_data(pebs->dse);
+   data->data_src.val = 
load_latency_data(pebs->dse);
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
-   data.data_src.val =
+   data->data_src.val =
precise_store_data_hsw(event, 
pebs->dse);
else
-   data.data_src.val = 
precise_store_data(pebs->dse);
+   data->data_src.val = 
precise_store_data(pebs->dse);
}
}
 
@@ -928,35 +925,47 @@ static void __intel_pmu_pebs_event(struct perf_event 
*event,
 * PERF_SAMPLE_IP and PERF_SAMPLE_CALLCHAIN to function properly.
 * A possible PERF_SAMPLE_REGS will have to transfer all regs.
 */
-   regs = *iregs;
-   regs.flags = pebs->flags;
-   set_linear_ip(&regs, pebs->ip);
-   regs.bp = pebs->bp;
-   regs.sp = pebs->sp;
+   *regs = *iregs;
+   regs->flags = pebs->flags;
+   set_linear_ip(regs, pebs->ip);
+   regs->bp = pebs->bp;
+   regs->sp = pebs->sp;
 
if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
-   regs.ip = pebs->real_ip;
-   regs.flags |= PERF_EFLAGS_EXACT;
-   } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
-   regs.flags |= PERF_EFLAGS_EXACT;
+   regs->ip = pebs->real_ip;
+   regs->flags |= PERF_EFLAGS_EXACT;
+   } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(regs))
+   regs->flags |= PERF_EFLAGS_EXACT;
else
-   regs.flags &= ~PERF_EFLAGS_EXACT;
+   regs->flags &= ~PERF_EFLAGS_EXACT;
 
if ((event->attr.sample_type & PERF_SAMPLE_ADDR) &&
x86_pmu.intel_cap.pebs_format >= 1)
-   data.addr = pebs->dla;
+   data->addr = pebs->dla;
 
if (x86_pmu.intel_cap.pebs_format >= 2) {
/* Only set the TSX weight when no memory weight. */
if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll)
-   data.weight = intel_hsw_weight(pebs);
+   data->weight = intel_hsw_weight(pebs);
 
if (event->attr.sample_type & PERF_SAMPLE_TRANSACTION)
-   data.txn = intel_hsw_transaction(pebs);
+   data->txn = intel_hsw_transaction(pebs);
}

[PATCH v3 5/9] perf, x86: large PEBS interrupt threshold

2014-07-22 Thread Yan, Zheng

PEBS always had the capability to log samples to its buffers without
an interrupt. Traditionally perf has not used this but always set the
PEBS threshold to one.

For frequently occurring events (like cycles or branches or load/stores)
this in turn requires using a relatively high sampling period to avoid
overloading the system, by only processing PMIs. This in turn increases
sampling error.

For the common cases we still need to use the PMI because the PEBS
hardware has various limitations. The biggest one is that it can not
supply a callgraph. It also requires setting a fixed period, as the
hardware does not support adaptive period. Another issue is that it
cannot supply a time stamp and some other options. To supply a TID it
requires flushing on context switch. It can however supply the IP, the
load/store address, TSX information, registers, and some other things.

So we can make PEBS work for some specific cases, basically as long as
you can do without a callgraph and can set the period you can use this
new PEBS mode.
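
The eligibility rule above amounts to a mask test: the event must use a fixed period and must not request any sample field that needs a PMI. A minimal sketch, using hypothetical flag values rather than the kernel's PERF_SAMPLE_* constants:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define EX_SAMPLE_IP        (UINT64_C(1) << 0)
#define EX_SAMPLE_TID       (UINT64_C(1) << 1)
#define EX_SAMPLE_TIME      (UINT64_C(1) << 2)
#define EX_SAMPLE_ADDR      (UINT64_C(1) << 3)
#define EX_SAMPLE_CALLCHAIN (UINT64_C(1) << 4)

/* Fields the hardware can log by itself, without an interrupt. */
#define EX_FREERUNNING_FLAGS (EX_SAMPLE_IP | EX_SAMPLE_TID | EX_SAMPLE_ADDR)

/*
 * The large threshold is usable only with a fixed period and when every
 * requested sample field is within the free-running set.
 */
static bool can_use_large_threshold(uint64_t sample_type, bool fixed_period)
{
    return fixed_period && !(sample_type & ~EX_FREERUNNING_FLAGS);
}
```

Requesting a callgraph or a time stamp sets a bit outside the allowed mask, so such events fall back to the threshold of one record.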

The main benefit is the ability to support much lower sampling period
(down to -c 1000) without extensive overhead.

One use case is, for example, to increase the resolution of the c2c tool.
Another is double checking when you suspect the standard sampling has
too much sampling error.

Some numbers on the overhead, using cycle soak, comparing
"perf record --no-time -e cycles:p -c" to "perf record -e cycles:p -c"

period    plain  multi  delta
10003     15     5      10
20003     15.7   4      11.7
40003     8.7    2.5    6.2
80003     4.1    1.4    2.7
13        3.6    1.2    2.4
83        4.4    1.4    3
103       0.6    0.4    0.2
203       0.4    0.3    0.1
403       0.3    0.2    0.1
1003      0.3    0.2    0.1

The interesting part is the delta between multi-pebs and normal pebs. Above
-c 103 it does not really matter because the basic overhead is so low.
With periods below 80003 it becomes interesting.

Note in some other workloads (e.g. kernbench) the smaller sampling periods
cause much more overhead without multi-pebs; up to 80% (and throttling) has
been observed with -c 10003. Multi-pebs generally does not throttle.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event_intel_ds.c | 40 +++
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 858c4ee..86ef5b0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -251,7 +251,7 @@ static int alloc_pebs_buffer(int cpu)
 {
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
int node = cpu_to_node(cpu);
-   int max, thresh = 1; /* always use a single PEBS record */
+   int max;
void *buffer, *ibuffer;
 
if (!x86_pmu.pebs)
@@ -281,9 +281,6 @@ static int alloc_pebs_buffer(int cpu)
ds->pebs_absolute_maximum = ds->pebs_buffer_base +
max * x86_pmu.pebs_record_size;
 
-   ds->pebs_interrupt_threshold = ds->pebs_buffer_base +
-   thresh * x86_pmu.pebs_record_size;
-
return 0;
 }
 
@@ -708,15 +705,35 @@ struct event_constraint *intel_pebs_constraints(struct 
perf_event *event)
return &emptyconstraint;
 }
 
+/*
+ * Flags PEBS can handle without an PMI.
+ *
+ * TID can only be handled by flushing at context switch.
+ */
+#define PEBS_FREERUNNING_FLAGS \
+   (PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
+PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
+PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
+PERF_SAMPLE_TRANSACTION)
+
+static inline bool pebs_is_enabled(struct cpu_hw_events *cpuc)
+{
+   return (cpuc->pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
+}
+
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
+   struct debug_store *ds = cpuc->ds;
+   bool first_pebs;
+   u64 threshold;
 
hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
if (!event->attr.freq)
hwc->flags |= PERF_X86_EVENT_AUTO_RELOAD;
 
+   first_pebs = !pebs_is_enabled(cpuc);
cpuc->pebs_enabled |= 1ULL << hwc->idx;
 
if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
@@ -724,6 +741,21 @@ void intel_pmu_pebs_enable(struct perf_event *event)
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled |= 1ULL << 63;
 
+   /*
+* When the event is constrained enough we can use a larger
+* threshold and run the event with less frequent PMI.
+*/
+   if (0 && /* disable this temporarily */
+   (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) &&
+   !(event->attr.sample_type & ~PEBS_FREERUNNING_FLAGS)) {
+   threshold = ds->pebs_absolute_maximum -
+   x86_

[PATCH v3 7/9] perf, x86: drain PEBS buffer during context switch

2014-07-22 Thread Yan, Zheng

Flush the PEBS buffer during context switch if PEBS interrupt threshold
is larger than one. This allows perf to supply TID for sample outputs.

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event.h   |  3 +++
 arch/x86/kernel/cpu/perf_event_intel.c | 11 +-
 arch/x86/kernel/cpu/perf_event_intel_ds.c  | 32 --
 arch/x86/kernel/cpu/perf_event_intel_lbr.c |  2 --
 4 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index fa8dfd4..c4a746c 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -148,6 +148,7 @@ struct cpu_hw_events {
 */
struct debug_store  *ds;
u64 pebs_enabled;
+   boolpebs_sched_cb_enabled;
 
/*
 * Intel LBR bits
@@ -683,6 +684,8 @@ void intel_pmu_pebs_enable_all(void);
 
 void intel_pmu_pebs_disable_all(void);
 
+void intel_pmu_pebs_sched_task(struct perf_event_context *ctx, bool sched_in);
+
 void intel_ds_init(void);
 
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c 
b/arch/x86/kernel/cpu/perf_event_intel.c
index ef926ee..cb5a838 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2035,6 +2035,15 @@ static void intel_pmu_cpu_dying(int cpu)
fini_debug_store_on_cpu(cpu);
 }
 
+static void intel_pmu_sched_task(struct perf_event_context *ctx,
+bool sched_in)
+{
+   if (x86_pmu.pebs_active)
+   intel_pmu_pebs_sched_task(ctx, sched_in);
+   if (x86_pmu.lbr_nr)
+   intel_pmu_lbr_sched_task(ctx, sched_in);
+}
+
 PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63");
 
 PMU_FORMAT_ATTR(ldlat, "config1:0-15");
@@ -2086,7 +2095,7 @@ static __initconst const struct x86_pmu intel_pmu = {
.cpu_starting   = intel_pmu_cpu_starting,
.cpu_dying  = intel_pmu_cpu_dying,
.guest_get_msrs = intel_guest_get_msrs,
-   .sched_task = intel_pmu_lbr_sched_task,
+   .sched_task = intel_pmu_sched_task,
 };
 
 static __init void intel_clovertown_quirk(void)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c 
b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 1e3b8cf..99b07de0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -705,6 +705,18 @@ struct event_constraint *intel_pebs_constraints(struct 
perf_event *event)
return &emptyconstraint;
 }
 
+static inline void intel_pmu_drain_pebs_buffer(void)
+{
+   struct pt_regs regs;
+   x86_pmu.drain_pebs(&regs);
+}
+
+void intel_pmu_pebs_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+   if (!sched_in)
+   intel_pmu_drain_pebs_buffer();
+}
+
 /*
  * Flags PEBS can handle without an PMI.
  *
@@ -745,13 +757,20 @@ void intel_pmu_pebs_enable(struct perf_event *event)
 * When the event is constrained enough we can use a larger
 * threshold and run the event with less frequent PMI.
 */
-   if (0 && /* disable this temporarily */
-   (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) &&
+   if ((hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) &&
!(event->attr.sample_type & ~PEBS_FREERUNNING_FLAGS)) {
threshold = ds->pebs_absolute_maximum -
x86_pmu.max_pebs_events * x86_pmu.pebs_record_size;
+   if (first_pebs) {
+   perf_sched_cb_user_inc(event->ctx->pmu);
+   cpuc->pebs_sched_cb_enabled = true;
+   }
} else {
threshold = ds->pebs_buffer_base + x86_pmu.pebs_record_size;
+   if (cpuc->pebs_sched_cb_enabled) {
+   perf_sched_cb_user_dec(event->ctx->pmu);
+   cpuc->pebs_sched_cb_enabled = false;
+   }
}
if (first_pebs || ds->pebs_interrupt_threshold > threshold)
ds->pebs_interrupt_threshold = threshold;
@@ -767,8 +786,17 @@ void intel_pmu_pebs_disable(struct perf_event *event)
 {
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
+   struct debug_store *ds = cpuc->ds;
+
+   if (ds->pebs_interrupt_threshold >
+   ds->pebs_buffer_base + x86_pmu.pebs_record_size)
+   intel_pmu_drain_pebs_buffer();
 
cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+   if (cpuc->pebs_sched_cb_enabled && !pebs_is_enabled(cpuc)) {
+   perf_sched_cb_user_dec(event->ctx->pmu);
+   cpuc->pebs_sched_cb_enabled = false;
+   }
 
if (event->hw.constraint->flags & PERF_X86_EVENT_PEBS_LDLAT)
cpuc->pebs_enabled &= ~(1ULL << (hwc->idx + 32));
diff --git a/arch/x86/kernel/cpu/perf_event_int

[PATCH v3 0/7] perf, x86: large PEBS interrupt threshold

2014-07-22 Thread Yan, Zheng

This patch series implements large PEBS interrupt threshold. For some
limited cases, it can significantly reduce the sample overhead. Please
read patch 6's commit message for more information.

changes since v1:
  - drop patch 'perf, core: Add all PMUs to pmu_idr'
  - add comments for case that multiple counters overflow simultaneously
changes since v2:
  - rename perf_sched_cb_{enable,disable} to perf_sched_cb_user_{inc,dec} 
  - use flag to indicate auto reload mechanism
  - move the code that sets up PEBS sample data to a separate function
  - output the PEBS records in batch
  - enable this for all (PEBS capable) hardware
  - more description for the multiplex

Re: [PATCH v2] drm/panel: add support for InnoLux N156BGE-L21 panel

2014-07-22 Thread Thierry Reding

On Tue, Jul 22, 2014 at 08:38:55AM +0200, Alban Bedel wrote:
> This panel is used by the Medcom Wide and supported by the
> simple-panel driver.
> 
> Signed-off-by: Alban Bedel 
> ---
> v2: * Added the v/hsync pulses for correctness (the panel doesn't
>   really need them)
> * Fixed the size to report the physical size in mm
> ---
>  .../bindings/panel/innolux,n156bge-l21.txt |  7 ++
>  drivers/gpu/drm/panel/panel-simple.c   | 25 
> ++
>  2 files changed, 32 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/panel/innolux,n156bge-l21.txt

Applied, thanks.

I slightly reordered where the new panel was added, since they're sorted
alphabetically by vendor, then device.

Thierry



[PATCH v3 1/9] perf, core: introduce pmu context switch callback

2014-07-22 Thread Yan, Zheng

The callback is invoked when a process is scheduled in or out.
It provides a mechanism for later patches to save/restore the LBR
stack. For the schedule-in case, the callback is invoked at
the same place that the flush branch stack callback is invoked,
so it can also replace the flush branch stack callback. To
avoid unnecessary overhead, the callback is enabled only when
there are events that use the LBR stack.
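
The "enabled only when there are users" idea can be sketched with a plain counter. This is a single-threaded illustration under stated assumptions: the names are hypothetical, and an ordinary global stands in for what would be a per-CPU counter.

```c
#include <assert.h>
#include <stdbool.h>

static int example_cb_users;  /* stands in for a per-CPU user count */
static int example_cb_calls;  /* how often the callback actually ran */

/* An event that needs the callback registers and unregisters itself. */
static void example_cb_user_inc(void) { example_cb_users++; }
static void example_cb_user_dec(void) { example_cb_users--; }

/* Hypothetical callback body: here it only records that it was called. */
static void example_sched_task(bool sched_in)
{
    (void)sched_in;
    example_cb_calls++;
}

/* Context switch path: skip the callback entirely when nobody needs it. */
static void example_context_switch(bool sched_in)
{
    if (example_cb_users > 0)
        example_sched_task(sched_in);
}
```

The cost on the context switch path is then a single counter check whenever no event has asked for the callback.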

Signed-off-by: Yan, Zheng 
---
 arch/x86/kernel/cpu/perf_event.c |  7 +
 arch/x86/kernel/cpu/perf_event.h |  2 ++
 include/linux/perf_event.h   |  9 ++
 kernel/events/core.c | 63 
 4 files changed, 81 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 2bdfbff..7d22972 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1874,6 +1874,12 @@ static const struct attribute_group 
*x86_pmu_attr_groups[] = {
NULL,
 };
 
+static void x86_pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+   if (x86_pmu.sched_task)
+   x86_pmu.sched_task(ctx, sched_in);
+}
+
 static void x86_pmu_flush_branch_stack(void)
 {
if (x86_pmu.flush_branch_stack)
@@ -1907,6 +1913,7 @@ static struct pmu pmu = {
 
.event_idx  = x86_pmu_event_idx,
.flush_branch_stack = x86_pmu_flush_branch_stack,
+   .sched_task = x86_pmu_sched_task,
 };
 
 void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 3b2f9bd..e70b352 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -429,6 +429,8 @@ struct x86_pmu {
 
void(*check_microcode)(void);
void(*flush_branch_stack)(void);
+   void(*sched_task)(struct perf_event_context *ctx,
+ bool sched_in);
 
/*
 * Intel Arch Perfmon v2+
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 707617a..fe92e6b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -262,6 +262,13 @@ struct pmu {
 * flush branch stack on context-switches (needed in cpu-wide mode)
 */
void (*flush_branch_stack)  (void);
+
+   /*
+* context-switches callback for CPU PMU. Other PMUs shouldn't set
+* this callback
+*/
+   void (*sched_task)  (struct perf_event_context *ctx,
+bool sched_in);
 };
 
 /**
@@ -557,6 +564,8 @@ extern void perf_event_delayed_put(struct task_struct 
*task);
 extern void perf_event_print_debug(void);
 extern void perf_pmu_disable(struct pmu *pmu);
 extern void perf_pmu_enable(struct pmu *pmu);
+extern void perf_sched_cb_user_inc(struct pmu *pmu);
+extern void perf_sched_cb_user_dec(struct pmu *pmu);
 extern int perf_event_task_disable(void);
 extern int perf_event_task_enable(void);
 extern int perf_event_refresh(struct perf_event *event, int refresh);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67e3b9c..7431bec 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -144,6 +144,7 @@ enum event_type_t {
 struct static_key_deferred perf_sched_events __read_mostly;
 static DEFINE_PER_CPU(atomic_t, perf_cgroup_events);
 static DEFINE_PER_CPU(atomic_t, perf_branch_stack_events);
+static DEFINE_PER_CPU(int, perf_sched_cb_users);
 
 static atomic_t nr_mmap_events __read_mostly;
 static atomic_t nr_comm_events __read_mostly;
@@ -2362,6 +2363,58 @@ unlock:
}
 }
 
+void perf_sched_cb_user_inc(struct pmu *pmu)
+{
+   this_cpu_inc(perf_sched_cb_users);
+}
+
+void perf_sched_cb_user_dec(struct pmu *pmu)
+{
+   this_cpu_dec(perf_sched_cb_users);
+}
+
+/*
+ * This function provides the context switch callback to the lower code
+ * layer. It is invoked ONLY when the context switch callback is enabled.
+ */
+static void perf_pmu_sched_task(struct task_struct *prev,
+   struct task_struct *next,
+   bool sched_in)
+{
+   struct perf_cpu_context *cpuctx;
+   struct pmu *pmu;
+   unsigned long flags;
+
+   if (prev == next)
+   return;
+
+   local_irq_save(flags);
+
+   rcu_read_lock();
+
+   list_for_each_entry_rcu(pmu, &pmus, entry) {
+   if (pmu->sched_task) {
+   cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
+
+   perf_ctx_lock(cpuctx, cpuctx->task_ctx);
+
+   perf_pmu_disable(pmu);
+
+   pmu->sched_task(cpuctx->task_ctx, sched_in);
+
+   perf_pmu_enable(pmu);
+
+   perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
+   /* only CPU PMU has context switch callback */
+   break;
+   }
+   }
+

Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Theodore Ts'o

On Tue, Jul 22, 2014 at 09:30:05AM +0200, Peter Zijlstra wrote:
> > I introduce a new API for allocating page from non-movable area.
> > It is useful for ext4 and others that want to hold page cache for a long 
> > time.
> 
> There's no word on why you can't teach ext4 to still migrate that page.
> For all I know it might be impossible, but at least mention why.

In theory we might be able to do it, but it's only a single 4k page,
and we'd have to add RCU locking all over the place in order to be
able to switch out the superblock structure, since we reference it all
over the place inside fs/ext4.  The question I'd ask is whether it is
worth it.

I suspect the bigger deal is that there are all sorts of inodes and
dentries which are effectively pinned and thus, impossible to migrate.
This probably locks down many more pages (by a factor of at least 10 or
20), and I'd think that's something you would be much more interested
in fixing.

   - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RESEND PATCH] mfd: max14577: Don't pass IRQ domain to mfd_add_devices

2014-07-22 Thread Krzysztof Kozlowski

The max14577 MFD cells do not have any resources so the IRQ domain
passed to mfd_add_devices is not used.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/mfd/max14577.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/mfd/max14577.c b/drivers/mfd/max14577.c
index 4a5e885383f8..ba2ac9c10b12 100644
--- a/drivers/mfd/max14577.c
+++ b/drivers/mfd/max14577.c
@@ -372,8 +372,7 @@ static int max14577_i2c_probe(struct i2c_client *i2c,
 	}
 
 	ret = mfd_add_devices(max14577->dev, -1, mfd_devs,
-			mfd_devs_size, NULL, 0,
-			regmap_irq_get_domain(max14577->irq_data));
+			mfd_devs_size, NULL, 0, NULL);
 	if (ret < 0)
 		goto err_mfd;
 
-- 
1.9.1


3.16-rc6+: r8169 fresh new irq lock inversion

2014-07-22 Thread Borislav Petkov

Hi all,

I got this new lockdep splat after booting rc6 + tip/master this
morning. I haven't seen it before so it must be new in rc6 which should
be easy to pinpoint... or maybe there's already a fix queued somewhere. :)


[   10.033876] NET: Registered protocol family 10
[   11.510454] r8169 :02:00.0 eth0: link down
[   11.510566] r8169 :02:00.0 eth0: link down
[   11.522573] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   13.024024] r8169 :02:00.0 eth0: link up
[   13.031636] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   14.024289] 
[   14.028860] =
[   14.038419] [ INFO: possible irq lock inversion dependency detected ]
[   14.047992] 3.16.0-rc6+ #1 Not tainted
[   14.054806] -
[   14.064269] swapper/2/0 just changed the state of lock:
[   14.072523]  (&tb->tb6_lock){++--..}, at: [] 
ip6_ins_rt+0x47/0x80 [ipv6]
[   14.083983] but this lock took another, SOFTIRQ-READ-unsafe lock in the past:
[   14.094142]  (nl_table_lock){.+.?..}
[   14.094142] 
[   14.094142] and interrupts could create inverse lock ordering between them.
[   14.094142] 
[   14.115623] 
[   14.115623] other info that might help us debug this:
[   14.128001]  Possible interrupt unsafe locking scenario:
[   14.128001] 
[   14.140632]        CPU0                    CPU1
[   14.148080]        ----                    ----
[   14.155494]   lock(nl_table_lock);
[   14.161756]                                local_irq_disable();
[   14.170487]                                lock(&tb->tb6_lock);
[   14.179272]                                lock(nl_table_lock);
[   14.188002]   <Interrupt>
[   14.193397]     lock(&tb->tb6_lock);
[   14.199835] 
[   14.199835]  *** DEADLOCK ***
[   14.199835] 
[   14.214025] 1 lock held by swapper/2/0:
[   14.220632]  #0:  (rcu_read_lock){..}, at: [] 
__netif_receive_skb_core+0x65/0xc00
[   14.232984] 
[   14.232984] the shortest dependencies between 2nd lock and 1st lock:
[   14.246511]  -> (nl_table_lock){.+.?..} ops: 4153 {
[   14.254361] HARDIRQ-ON-R at:
[   14.260442]   [] 
__lock_acquire+0x39c/0x2230
[   14.270889]   [] 
lock_acquire+0xb9/0x200
[   14.280886]   [] 
_raw_read_lock+0x44/0x80
[   14.291002]   [] 
netlink_broadcast_filtered+0x42/0x360
[   14.302244]   [] 
netlink_broadcast+0x1d/0x20
[   14.312580]   [] nlmsg_notify+0xbd/0xd0
[   14.322462]   [] 
rtmsg_ifinfo+0xdc/0x100
[   14.332448]   [] 
register_netdevice+0x495/0x570
[   14.343009]   [] 
register_netdev+0x1f/0x30
[   14.353107]   [] 
loopback_net_init+0x3b/0x7f
[   14.363390]   [] 
ops_init.constprop.8+0xaf/0x180
[   14.374001]   [] 
register_pernet_operations.isra.5+0x6c/0xb0
[   14.385668]   [] 
register_pernet_device+0x27/0x70
[   14.396421]   [] 
net_dev_init+0x18c/0x1e8
[   14.406439]   [] 
do_one_initcall+0xa0/0x1f0
[   14.416660]   [] 
kernel_init_freeable+0x114/0x19c
[   14.427385]   [] kernel_init+0xe/0xe0
[   14.437061]   [] 
ret_from_fork+0x7c/0xb0
[   14.446978] IN-SOFTIRQ-R at:
[   14.452894]   [] 
__lock_acquire+0x929/0x2230
[   14.463166]   [] 
lock_acquire+0xb9/0x200
[   14.473085]   [] 
_raw_read_lock+0x44/0x80
[   14.483106]   [] 
netlink_broadcast_filtered+0x42/0x360
[   14.494290]   [] 
netlink_broadcast+0x1d/0x20
[   14.504553]   [] nlmsg_notify+0xbd/0xd0
[   14.514385]   [] rtnl_notify+0x3b/0x40
[   14.524087]   [] 
__neigh_notify+0xb7/0xe0
[   14.534057]   [] 
neigh_update+0x444/0x950
[   14.544043]   [] 
arp_process+0x30d/0x820
[   14.553910]   [] arp_rcv+0x10b/0x140
[   14.563412]   [] 
__netif_receive_skb_core+0x24c/0xc00
[   14.574412]   [] 
__netif_receive_skb+0x1b/0x70
[   14.584771]   [] 
netif_receive_skb_internal+0x2d/0x1e0
[   14.595832]   [] 
napi_gro_receive+0x90/0x1b0
[   14.606029]   [] 
rtl8169_poll+0x30b/0x640 [r8169]
[   14.616675]   [] 
net_rx_action+0x132/0x310
[   14.626757]   [] 
__do_softirq+0xed/0x4d0
[   14.636621]   [] irq_exit+0x8e/0xb0
[   14.646038]   [] do_IRQ+0x71/0x110
[   14.655368]   [] ret_from_intr+0x0/0x13
[   14.665093]   [] 
cpuidle_enter+0x17/0x20
[   14.674883]   [] 
cpu_startup_entry+0x287/0x7c0
[   14.685180]

Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Oded Gabbay


On 22/07/14 10:28, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 03:03:07PM -0400, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 09:41:29PM +0300, Oded Gabbay wrote:

On 21/07/14 21:22, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 7:28 PM, Oded Gabbay  wrote:

I'm not sure whether we can do the same trick with the hw scheduler. But
then unpinning hw contexts will drain the pipeline anyway, so I guess we
can just stop feeding the hw scheduler until it runs dry. And then unpin
and evict.

So, I'm afraid we can't do this for AMD Kaveri because:


Well as long as you can drain the hw scheduler queue (and you can do
that, worst case you have to unmap all the doorbells and other stuff
to intercept further submission from userspace) you can evict stuff.


I can't drain the hw scheduler queue, as I can't do mid-wave preemption.
Moreover, if I use the dequeue request register to preempt a queue
during a dispatch it may be that some waves (wave groups actually) of
the dispatch have not yet been created, and when I reactivate the mqd,
they should be created but are not. However, this works fine if you use
the HIQ. The CP ucode correctly saves and restores the state of an
outstanding dispatch. I don't think we have access to the state from
software at all, so it's not a bug, it is "as designed".



I think here Daniel is suggesting to unmap the doorbell page and track
each write made by userspace to it, and while it is unmapped, wait for the
gpu to drain or use some kind of fence on a special queue. Once the GPU is
drained we can move the pinned buffer, then remap the doorbell and update it
to the last value written by userspace, which will resume execution at the
next job.


Exactly, just prevent userspace from submitting more. And if you have
misbehaving userspace that submits too much, reset the gpu and tell it
that you're sorry but won't schedule any more work.


I'm not sure how you intend to know whether userspace misbehaves or not. Can
you elaborate?


Oded


We have this already in i915 (since like all other gpus we're not
preempting right now) and it works. There's some code floating around to
even restrict the reset to _just_ the offending submission context, with
nothing else getting corrupted.

You can do all this with the doorbells and unmapping them, but it's a
pain. Much easier if you have a real ioctl, and I haven't seen anyone with
perf data indicating that an ioctl would be too much overhead on linux.
Neither in this thread nor internally here at intel.
-Daniel




Re: [PATCH v2 00/25] AMDKFD kernel driver

2014-07-22 Thread Oded Gabbay


On 22/07/14 10:40, Daniel Vetter wrote:

On Tue, Jul 22, 2014 at 09:28:51AM +0200, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 03:03:07PM -0400, Jerome Glisse wrote:

On Mon, Jul 21, 2014 at 09:41:29PM +0300, Oded Gabbay wrote:

On 21/07/14 21:22, Daniel Vetter wrote:

On Mon, Jul 21, 2014 at 7:28 PM, Oded Gabbay  wrote:

I'm not sure whether we can do the same trick with the hw scheduler. But
then unpinning hw contexts will drain the pipeline anyway, so I guess we
can just stop feeding the hw scheduler until it runs dry. And then unpin
and evict.

So, I'm afraid we can't do this for AMD Kaveri because:


Well as long as you can drain the hw scheduler queue (and you can do
that, worst case you have to unmap all the doorbells and other stuff
to intercept further submission from userspace) you can evict stuff.


I can't drain the hw scheduler queue, as I can't do mid-wave preemption.
Moreover, if I use the dequeue request register to preempt a queue
during a dispatch it may be that some waves (wave groups actually) of
the dispatch have not yet been created, and when I reactivate the mqd,
they should be created but are not. However, this works fine if you use
the HIQ. The CP ucode correctly saves and restores the state of an
outstanding dispatch. I don't think we have access to the state from
software at all, so it's not a bug, it is "as designed".



I think here Daniel is suggesting to unmap the doorbell page and track
each write made by userspace to it, and while it is unmapped, wait for the
gpu to drain or use some kind of fence on a special queue. Once the GPU is
drained we can move the pinned buffer, then remap the doorbell and update it
to the last value written by userspace, which will resume execution at the
next job.


Exactly, just prevent userspace from submitting more. And if you have
misbehaving userspace that submits too much, reset the gpu and tell it
that you're sorry but won't schedule any more work.

We have this already in i915 (since like all other gpus we're not
preempting right now) and it works. There's some code floating around to
even restrict the reset to _just_ the offending submission context, with
nothing else getting corrupted.

You can do all this with the doorbells and unmapping them, but it's a
pain. Much easier if you have a real ioctl, and I haven't seen anyone with
perf data indicating that an ioctl would be too much overhead on linux.
Neither in this thread nor internally here at intel.


Aside: Another reason why the ioctl is better than the doorbell is
integration with other drivers. Yeah I know this is about compute, but
sooner or later someone will want to e.g. post-proc video frames between
the v4l capture device and the gpu mpeg encoder. Or something else fancy.

Then you want to be able to somehow integrate into a cross-driver fence
framework like android syncpts, and you can't do that without an ioctl for
the compute submissions.
-Daniel



I assume you are talking about interop between graphics and compute. For that,
we have a module that is now being tested, and it indeed uses an ioctl to map a
graphics object into the compute process address space. However, after the
translation is done, the work is done only in userspace.


Oded

Re: [PATCH] blk-mq: bitmap tag: fix potential unwakeable sleep in bt_get()

2014-07-22 Thread Alexander Gordeev

On Mon, Jul 21, 2014 at 11:11:44AM +0200, Alexander Gordeev wrote:
> > My bad, was looking at the wrong sources. What is the race? If the
> > clear and wakeup come in after the prepare_to_wait() but before the
> > io_schedule(), the io_schedule() will be a no-op and it won't
> > actually sleep. Your commit message doesn't mention any particular
> > details.
> 
> Let me examine the code. It was a few weeks ago and I cannot
> remember what that was. And the changelog does not help :)

Jens,

I cannot recall what my concern was. Sorry for the noise.

Thanks!

> > -- 
> > Jens Axboe
> > 

-- 
Regards,
Alexander Gordeev
agord...@redhat.com
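
The point Jens makes in the quoted text (a wakeup that arrives after
prepare_to_wait() but before io_schedule() turns the io_schedule() into a
no-op) can be illustrated with a toy user-space model. This is only a
sketch: the *_model names and the two-state enum are made up for
illustration and are not the kernel's implementation.

```c
/* Toy task states; the kernel has more, these two are enough here. */
enum task_state_model { TASK_RUNNING_M, TASK_SLEEPING_M };

struct task_model {
	enum task_state_model state;
	int slept;	/* counts how often the task really blocked */
};

/* Models prepare_to_wait(): mark the task as about to sleep. */
void prepare_to_wait_model(struct task_model *t)
{
	t->state = TASK_SLEEPING_M;
}

/* Models wake_up(): mark the task runnable, whether or not it slept yet. */
void wake_up_model(struct task_model *t)
{
	t->state = TASK_RUNNING_M;
}

/* Models io_schedule(): blocks only if the task is still marked sleeping. */
void io_schedule_model(struct task_model *t)
{
	if (t->state == TASK_SLEEPING_M)
		t->slept++;		/* would really block here */
	t->state = TASK_RUNNING_M;	/* running again once it returns */
}
```

In this model, prepare/wake/schedule leaves slept at 0, while
prepare/schedule with no wakeup in between increments it.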

[PATCH] kfree followed by a TRACE_RET before ref should be ok

2014-07-22 Thread Nicholas Mc Guire


kfree.cocci currently triggers on constructs like

drivers/staging/rts5208/spi.c:
596    if (retval < 0) {
597        kfree(buf);
598        rtsx_clear_spi_error(chip);
599        spi_set_err_code(chip, SPI_HW_ERR);
600        TRACE_RET(chip, STATUS_FAIL);
601    }
602
603    rtsx_stor_access_xfer_buf(buf, pagelen, srb, &index, &offset,
                                 TO_XFER_BUF);

with:

./drivers/staging/rts5208/spi.c:603:28-31: ERROR: reference preceded 
by free on line 597

but this should be fine, so TRACE_RET is added to the list of calls 
"protecting" access to freed objects; see drivers/staging/rts5208/trace.h


Acked-by: Julia Lawall 
Signed-off-by: Nicholas Mc Guire 
---
 scripts/coccinelle/free/kfree.cocci |2 ++
 1 file changed, 2 insertions(+)

diff --git a/scripts/coccinelle/free/kfree.cocci 
b/scripts/coccinelle/free/kfree.cocci
index 577b780..04d3f4f 100644
--- a/scripts/coccinelle/free/kfree.cocci
+++ b/scripts/coccinelle/free/kfree.cocci
@@ -101,6 +101,8 @@ kfree@p1(E,...)
 |
  return_ACPI_STATUS(...)
 |
+ TRACE_RET(...)
+|
  E@p2 // bad use
 )
 
-- 
1.7.10.4
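
Why the flagged construct is harmless can be shown with a small user-space
model. This is a sketch under assumptions: the TRACE_RET below is a
simplified stand-in (the real macro is in drivers/staging/rts5208/trace.h),
and struct chip_model and xfer_model() are hypothetical names. What matters
is that the macro returns from the calling function, so the later use of
buf is never reached after the free.

```c
#include <stdlib.h>
#include <string.h>

#define STATUS_SUCCESS 0
#define STATUS_FAIL    1

/* Hypothetical stand-in for the chip structure; only counts trace hits. */
struct chip_model {
	int trace_hits;
};

/*
 * Simplified assumption of TRACE_RET: record a trace point, then return
 * from the *calling* function.
 */
#define TRACE_RET(chip, ret)		\
	do {				\
		(chip)->trace_hits++;	\
		return (ret);		\
	} while (0)

/* Fills out[] from a temporary buffer unless retval signals an error. */
int xfer_model(struct chip_model *chip, int retval, char *out, size_t len)
{
	char *buf = malloc(len);

	if (!buf)
		TRACE_RET(chip, STATUS_FAIL);

	memset(buf, 'x', len);

	if (retval < 0) {
		free(buf);
		TRACE_RET(chip, STATUS_FAIL);	/* leaves the function here */
	}

	/* Only reached while buf is still valid. */
	memcpy(out, buf, len);
	free(buf);
	return STATUS_SUCCESS;
}
```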


[PATCH] x86_32, entry: store badsys error code in %eax

2014-07-22 Thread Sven Wegener

Commit 554086d ("x86_32, entry: Do syscall exit work on badsys
(CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
code, resulting in syscall() not returning proper errors for undefined
syscalls on CPUs supporting the sysenter feature.

The following code:

> int result = syscall(666);
> printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));

results in:

> result=666 errno=0 error=Success

Obviously, the syscall return value is the called syscall number, but it
should have been an ENOSYS error. When run under ptrace it behaves
correctly, which makes it hard to debug in the wild:

> result=-1 errno=38 error=Function not implemented

The %eax register is the return value register. For debugging via ptrace
the syscall entry code stores the complete register context on the
stack. The badsys handlers only store the ENOSYS error code in the
ptrace register set and do not set %eax like a regular syscall handler
would. The old resume_userspace call chain contains code that clobbers
%eax and it restores %eax from the ptrace registers afterwards. The same
goes for the ptrace-enabled call chain. When ptrace is not used, the
syscall return value is the passed-in syscall number from the untouched
%eax register.

Use %eax as the return value register in syscall_badsys and
sysenter_badsys, like a real syscall handler does, and have the caller
push the value onto the stack for ptrace access.

Signed-off-by: Sven Wegener 
Reviewed-and-tested-by: Andy Lutomirski 
Cc: sta...@vger.kernel.org
---

I've updated the commit message and added the Reviewed-and-tested-by and 
Cc.

 arch/x86/kernel/entry_32.S | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index dbaa23e..0d0c9d4 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -425,8 +425,8 @@ sysenter_do_call:
 	cmpl $(NR_syscalls), %eax
 	jae sysenter_badsys
 	call *sys_call_table(,%eax,4)
-	movl %eax,PT_EAX(%esp)
 sysenter_after_call:
+	movl %eax,PT_EAX(%esp)
 	LOCKDEP_SYS_EXIT
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
@@ -502,6 +502,7 @@ ENTRY(system_call)
 	jae syscall_badsys
 syscall_call:
 	call *sys_call_table(,%eax,4)
+syscall_after_call:
 	movl %eax,PT_EAX(%esp)		# store the return value
 syscall_exit:
 	LOCKDEP_SYS_EXIT
@@ -675,12 +676,12 @@ syscall_fault:
 END(syscall_fault)
 
 syscall_badsys:
-	movl $-ENOSYS,PT_EAX(%esp)
-	jmp syscall_exit
+	movl $-ENOSYS,%eax
+	jmp syscall_after_call
 END(syscall_badsys)
 
 sysenter_badsys:
-	movl $-ENOSYS,PT_EAX(%esp)
+	movl $-ENOSYS,%eax
 	jmp sysenter_after_call
 END(syscall_badsys)
 	CFI_ENDPROC
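
The register flow described in the commit message can be modelled in plain
C. This is a user-space toy, not kernel code: struct cpu_model and the
badsys_*_model() functions are hypothetical names, and the
restore_from_stack flag stands in for the exit paths (such as the
ptrace-enabled one) that reload %eax from the saved registers.

```c
#include <errno.h>

/* Toy CPU state: the live %eax register and the saved PT_EAX stack slot. */
struct cpu_model {
	long eax;	/* live register, returned as-is on the plain path */
	long pt_eax;	/* saved slot, reloaded on the ptrace-style path */
};

/* Before the patch: the badsys handler writes only the saved slot. */
long badsys_old_model(struct cpu_model *cpu, long nr, int restore_from_stack)
{
	cpu->eax = nr;			/* %eax holds the syscall number */
	cpu->pt_eax = -ENOSYS;		/* movl $-ENOSYS,PT_EAX(%esp) */
	if (restore_from_stack)
		cpu->eax = cpu->pt_eax;	/* exit path reloads %eax */
	return cpu->eax;
}

/* After the patch: badsys sets %eax and the common path saves it. */
long badsys_new_model(struct cpu_model *cpu, long nr, int restore_from_stack)
{
	cpu->eax = nr;
	cpu->eax = -ENOSYS;		/* movl $-ENOSYS,%eax */
	cpu->pt_eax = cpu->eax;		/* movl %eax,PT_EAX(%esp) */
	if (restore_from_stack)
		cpu->eax = cpu->pt_eax;
	return cpu->eax;
}
```

With nr = 666 the old model returns 666 unless the saved slot is reloaded,
which mirrors the "result=666" versus "result=-1 errno=38" outputs above;
the new model returns -ENOSYS on both paths.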

Re: [PATCH 5/5] gpio: move gpio_ensure_requested() into legacy C file

2014-07-22 Thread Varka Bhadram


On 07/22/2014 12:47 PM, Alexandre Courbot wrote:

(...)


+   if (WARN(test_and_set_bit(FLAG_REQUESTED, &desc->flags) == 0,
+   "autorequest GPIO-%d\n", desc_to_gpio(desc))) {
+   if (!try_module_get(chip->owner)) {
+   gpiod_err(desc, "%s: module can't be gotten\n",
+   __func__);


Should match open parenthesis '('

gpiod_err(desc, "%s: module can't be gotten\n",
  __func__);


+   clear_bit(FLAG_REQUESTED, &desc->flags);
+   /* lose */
+   err = -EIO;
+   goto end;
+   }
+   desc->label = "[auto]";
+   /* caller must chip->request() w/o spinlock */
+   if (chip->request)
+   request = true;
+   }
+
+end:
+   spin_unlock_irqrestore(&gpio_lock, flags);
+
+   if (request) {
+   might_sleep_if(chip->can_sleep);
+   err = chip->request(chip, gpio_chip_hwgpio(desc));
+
+   if (err < 0) {
+   gpiod_dbg(desc, "%s: chip request fail, %d\n",
+   __func__, err);


Ditto.

(...)


--
Regards,
Varka Bhadram.


Re: [PATCH 2/2] mtd: nand: add ONFI timing mode to nand_timings converter

2014-07-22 Thread Boris BREZILLON

Hi Brian,

On Mon, 21 Jul 2014 19:39:47 -0700
Brian Norris  wrote:

> On Fri, Jul 11, 2014 at 09:49:42AM +0200, Boris BREZILLON wrote:
> > --- /dev/null
> > +++ b/drivers/mtd/nand/nand_timings.c
> > @@ -0,0 +1,250 @@
> > +/*
> > + *  Copyright (C) 2014 Free Electrons
> > + *
> > + *  Author: Boris BREZILLON 
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + */
> > +#include 
> [...]
> > +/**
> > + * onfi_async_timing_mode_to_sdr_timings - [NAND Interface] Retrieve NAND
> > + * timings according to the given ONFI timing mode
> > + * @mode: ONFI timing mode
> > + */
> > +const struct nand_sdr_timings *onfi_async_timing_mode_to_sdr_timings(int 
> > mode)
> > +{
> > +   if (mode < 0 || mode >= ARRAY_SIZE(onfi_sdr_timings))
> 
> Might need  for this.
> 
> > +   return ERR_PTR(-EINVAL);
> 
> And  for this.
> 
> > +
> > +   return &onfi_sdr_timings[mode];
> > +}
> > +EXPORT_SYMBOL(onfi_async_timing_mode_to_sdr_timings);
> 
>  for this.

Thanks for fixing these issues (I tend to rely on other inclusions when
the compiler does not yell at me :-))

Best Regards,

Boris



-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
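
The bounds-checked lookup discussed above can be tried outside the kernel
with stand-ins for the helpers it relies on. This is a minimal sketch: the
ERR_PTR/PTR_ERR/IS_ERR and ARRAY_SIZE definitions are user-space
approximations of the kernel helpers, and struct sdr_timings_model, its
single field and the table values are hypothetical placeholders rather than
the contents of onfi_sdr_timings.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define MAX_ERRNO 4095

/* User-space approximations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, cut-down timing record. */
struct sdr_timings_model {
	unsigned int tRC_min_ps;
};

static const struct sdr_timings_model timings_model[] = {
	{ .tRC_min_ps = 100000 },	/* placeholder for mode 0 */
	{ .tRC_min_ps = 50000 },	/* placeholder for mode 1 */
};

/* Returns the entry for a valid mode, or ERR_PTR(-EINVAL) otherwise. */
const struct sdr_timings_model *mode_to_timings_model(int mode)
{
	if (mode < 0 || (size_t)mode >= ARRAY_SIZE(timings_model))
		return ERR_PTR(-EINVAL);

	return &timings_model[mode];
}
```

A caller would check IS_ERR() on the result before dereferencing it and use
PTR_ERR() to recover the error code.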

Re: [PATCH 2/4] mfd: imanager2: Add Advantech EC APIs support for IT8516/18/28

2014-07-22 Thread Lee Jones

On Mon, 14 Jul 2014, Wei-Chun Pan wrote:

You have to write a commit log here.  What is this?  Why is it needed?
What problem does it solve?  What happens if it's not provided?  How
is it implemented?  Etc etc.

> Signed-off-by: Wei-Chun Pan 
> ---
>  drivers/mfd/imanager2_ec.c | 615 
> +
>  1 file changed, 615 insertions(+)
>  create mode 100644 drivers/mfd/imanager2_ec.c
> 
> diff --git a/drivers/mfd/imanager2_ec.c b/drivers/mfd/imanager2_ec.c
> new file mode 100644
> index 000..f7a0003
> --- /dev/null
> +++ b/drivers/mfd/imanager2_ec.c
> @@ -0,0 +1,615 @@
> +/*
> + * imanager2_ec.c - MFD accessing driver of Advantech EC IT8516/18/28
> + * Copyright (C) 2014  Richard Vidal-Dorsch 
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .

I'd prefer if you used the short version.

> + */
> +
> +#include 
> +#include 

I'm sure that you're missing a whole bunch of header files here.  You
are to include all files that you make use of in _this_ file.

> +#include 

Comment this line out to see what is not defined.  At the very least
you will need export.h and err.h.

[...]

> +static int imanager2_read_mailbox(u32 ecflag, u8 offset, u8 *data)
> +{
> +	if (ecflag & EC_FLAG_IO_MAILBOX) {
> +		int ret = ec_wait_ibc0();
> +		if (ret)
> +			return ret;
> +		inb(EC_IO_PORT_DATA);
> +		outb(offset + EC_IO_CMD_READ_OFFSET, EC_IO_PORT_CMD);
> +
> +		return ec_inb_after_obf1(data);
> +	} else {
> +		outb(offset, EC_ITE_PORT_OFS);
> +		*data = inb(EC_ITE_PORT_DATA);
> +	}
> +
> +	return 0;
> +}

All of the Mailbox controller code in this file should live in
drivers/mailbox.

Also, does your Mailbox controller support IRQs?

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

Re: [RFC PATCH 0/5] futex: introduce an optimistic spinning futex

2014-07-22 Thread Thomas Gleixner

On Tue, 22 Jul 2014, Peter Zijlstra wrote:
> Anyway, there is one big fail in the entire futex stack that we 'need'
> to sort some day and that is NUMA. Some people (again database people)
> explicitly do not use futexes and instead use sysvsem because of this.
> 
> The problem with numa futexes is that because they're vaddr based there
> is no (persistent) node information. You always end up having to fall
> back to looking in all nodes before you can guarantee there is no
> matching futex.
> 
> One way to achieve it is by extending the futex value to include a node
> number, but that's obviously a complete ABI break. Then again, it should
> be pretty straight fwd, since the node number doesn't need to be part of
> the actual atomic update part, just part of the userspace storage.

So you want per node hash buckets, right? Fair enough, but how do you
make sure, that no thread/process on a different node is fiddling with
that "node bound" futex as well?

Thanks,

tglx


Re: [PATCH 2/2] iio: adc: exynos_adc: Add support for S3C24xx ADC

2014-07-22 Thread Arnd Bergmann

On Tuesday 22 July 2014 11:11:14 Chanwoo Choi wrote:
> This patch adds support for the s3c2410/s3c2416/s3c2440/s3c2443 ADC. The
> s3c24xx is almost the same as ADCv1, but there are a few differences:
> - ADCMUX register address to select channel
> - ADCDAT mask (10bit or 12bit ADC resolution according to SoC version)

Very good, thanks for doing this patch!

(adding Heiko to Cc, he's probably interested in seeing this as well.)

One comment:
 
> @@ -101,12 +107,14 @@ struct exynos_adc {
>   struct completion   completion;
>  
>   u32 value;
> + u32 value2;
>   unsigned intversion;
>  };
> ...
> @@ -365,7 +448,7 @@ static int exynos_read_raw(struct iio_dev *indio_dev,
>   ret = -ETIMEDOUT;
>   } else {
>   *val = info->value;
> - *val2 = 0;
> + *val2 = info->value2;
>   ret = IIO_VAL_INT;
>   }
>  
> @@ -377,9 +460,11 @@ static int exynos_read_raw(struct iio_dev *indio_dev,
>  static irqreturn_t exynos_adc_isr(int irq, void *dev_id)
>  {
>   struct exynos_adc *info = (struct exynos_adc *)dev_id;
> + u32 mask = info->data->mask;
>  
>   /* Read value */
> - info->value = readl(ADC_V1_DATX(info->regs)) & ADC_DATX_MASK;
> + info->value = readl(ADC_V1_DATX(info->regs)) & mask;
> + info->value2 = readl(ADC_V1_DATY(info->regs)) & mask;
>  
>   /* clear irq */
>   if (info->data->clear_irq)

If I understand it right, this would only be necessary if we want
to do the touchscreen driver as a separate iio client using the
in-kernel interfaces. As Jonathan Cameron commented, we probably
don't want to do that though. Even if we do, it should be a separate
patch and not mixed in with the s3c24xx support.

Aside from this:

Acked-by: Arnd Bergmann 

Arnd
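
The per-SoC data mask mentioned in the patch description (10-bit or 12-bit
resolution) amounts to the following. This is a sketch only: the quoted
patch keeps a ready-made mask in info->data->mask, whereas this example
derives it from a bit count, and struct adc_data_model and both function
names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-SoC data: only the resolution differs in this sketch. */
struct adc_data_model {
	unsigned int bits;	/* 10 or 12, depending on the SoC */
};

/* Mask covering the low `bits` bits: 0x3ff for 10, 0xfff for 12. */
uint32_t adc_mask_model(const struct adc_data_model *d)
{
	return (UINT32_C(1) << d->bits) - 1;
}

/* Keep only the conversion result from a raw data register value. */
uint32_t adc_extract_model(const struct adc_data_model *d, uint32_t raw)
{
	return raw & adc_mask_model(d);
}
```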

Re: [PATCH 09/17] drm/radeon: use common fence implementation for fences

2014-07-22 Thread Christian König


On 22.07.2014 06:05, Dave Airlie wrote:

On 9 July 2014 22:29, Maarten Lankhorst  wrote:

Signed-off-by: Maarten Lankhorst 
---
  drivers/gpu/drm/radeon/radeon.h|   15 +-
  drivers/gpu/drm/radeon/radeon_device.c |   60 -
  drivers/gpu/drm/radeon/radeon_fence.c  |  223 ++--
  3 files changed, 248 insertions(+), 50 deletions(-)


 From what I can see this is still suffering from the problem that we
need to find a proper solution to,

My summary of the issues after talking to Jerome and Ben and
re-reading things is:

We really need to work out a better interface into the drivers to be
able to avoid random atomic entrypoints,


Which is exactly what I criticized from the very beginning. Good 
to know that I'm not the only one thinking that this isn't such a good idea.



I'm sure you have some ideas and I think you really need to
investigate them to move this thing forward,
even if it means some issues with android sync pts.


Actually I think that TTM's fence interface already gave quite a good 
hint of what it might look like. I can only guess that this won't fit with 
the Android stuff, otherwise I can't see a good reason why we didn't 
stick with that.



but none of the two major drivers seem to want the interface as-is so
something needs to give

My major question is why we need an atomic callback here at all, what
scenario does it cover?


Agree totally. As far as I can see all current uses of the interface are 
of the kind of waiting for a fence to signal.


No need for any callback from one driver into another, especially not in 
atomic context. If a driver needs such functionality it should just 
start up a kernel thread and do its waiting there.


This obviously shouldn't be an obstacle for pure hardware 
implementations where one driver signals a semaphore another driver is 
waiting for, or a high signal on an interrupt line directly wired 
between two chips. And I think this is a completely different topic and 
not necessarily part of the common fence interface we should currently 
focus on.


Christian.


Surely we can use a workqueue based callback to ask a driver to check
its signalling, is it really
that urgent?

Dave.


Re: Fix me in netx-regs.h

2014-07-22 Thread Arnd Bergmann

On Tuesday 22 July 2014 00:16:26 Nick Krause wrote:
> Hey Russell,
> I did give thought to our previous conversations and I will still do
> fixmes, but I am going to be more careful
> with how I submit them. Furthermore, it seems #define
> NETX_GPIO_COUNTER_CTRL_GPIO_RE is not defined
> correctly. As the maintainer, what should I define it to?

Are you offering to become the maintainer for netx?

Try to get hold of a reference manual for the chip. More importantly
than the fixme, I think we should be able to move this platform into
ARCH_MULTIPLATFORM. This means moving the contents of the header files
into the places where they are used, and finding a better home for the
pfifo driver.

Arnd

Re: [RFC PATCH 0/5] futex: introduce an optimistic spinning futex

2014-07-22 Thread Peter Zijlstra

On Tue, Jul 22, 2014 at 10:39:17AM +0200, Thomas Gleixner wrote:
> On Tue, 22 Jul 2014, Peter Zijlstra wrote:
> > Anyway, there is one big fail in the entire futex stack that we 'need'
> > to sort some day and that is NUMA. Some people (again database people)
> > explicitly do not use futexes and instead use sysvsem because of this.
> > 
> > The problem with numa futexes is that because they're vaddr based there
> > is no (persistent) node information. You always end up having to fall
> > back to looking in all nodes before you can guarantee there is no
> > matching futex.
> > 
> > One way to achieve it is by extending the futex value to include a node
> > number, but that's obviously a complete ABI break. Then again, it should
> > be pretty straight fwd, since the node number doesn't need to be part of
> > the actual atomic update part, just part of the userspace storage.
> 
> So you want per node hash buckets, right? Fair enough, but how do you
> make sure, that no thread/process on a different node is fiddling with
> that "node bound" futex as well?

You don't and that should work just as well, just slower. But since the
node id is in the futex 'value' we'll always end up in the right
node-hash, even if it's a remote one.

So yes, per node hashes, and a persistent futex->node map.

And before people start talking about mempol and using that to bind
memory to nodes and such, remember that private futexes do not have a
vma lookup and therefore mempols are impossible to use.
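
The per-node hash idea can be sketched as follows. Everything here is
hypothetical and exists only to illustrate the argument: the layout of
struct numa_futex_model, the node and bucket counts, and the hash itself
are assumptions, not a proposed ABI. The bucket is chosen from the futex
address and the node id stored beside the futex word, never from the node
the caller happens to run on.

```c
#include <stdint.h>

#define NR_NODES_MODEL   4	/* hypothetical node count */
#define BUCKETS_PER_NODE 16	/* hypothetical table size per node */

/* Hypothetical userspace layout: node id stored next to the futex word. */
struct numa_futex_model {
	uint32_t val;	/* the word that is updated atomically */
	uint32_t node;	/* home node; not part of the atomic update */
};

struct bucket_ref {
	uint32_t node;	/* which per-node hash table */
	uint32_t index;	/* bucket within that table */
};

/* Same futex, same bucket, regardless of which node the caller is on. */
struct bucket_ref futex_bucket_model(const struct numa_futex_model *f,
				     uint32_t caller_node)
{
	uintptr_t addr = (uintptr_t)f;
	struct bucket_ref ref;

	(void)caller_node;	/* a remote caller is slower, not different */
	ref.node = f->node % NR_NODES_MODEL;
	ref.index = (uint32_t)((addr >> 3) % BUCKETS_PER_NODE);
	return ref;
}
```

A thread on another node thus lands in the same (remote) per-node table,
which matches the "just slower" behaviour described above.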

Re: [PATCH 0/3] ACPI / PM: Make ACPI-based PCI wakeup work for the "freeze" sleep state

2014-07-22 Thread Peter Zijlstra

On Tue, Jul 22, 2014 at 03:23:29AM +0200, Rafael J. Wysocki wrote:

> That turned out to be more challenging than I had thought initially.
> 
> The last version I sent was almost OK, but it had some issues (like it could
> walk the PCI hierarchy before resuming any PCI devices during system resume),
> so a new version follows.  I did my best to avoid introducing any new problems
> with it, but I obviously might overlook something.
> 
> It works for me and doesn't seem to break anything as far as I can say.
> 
> [1/3] Make PM workqueue available for CONFIG_PM_RUNTIME unset.
> [2/3] Rework the handling of ACPI device wakeup notifications.
> [3/3] Enable wakeup GPEs while setting up devices for wakeup during system
>   suspend too.

Doesn't break, doesn't 'work' either. Is there anything I can provide
you with to make this easier? lspci output or anything like that?

Re: [PATCH 3/4] mfd: imanager2: Add Core supports for IT8516/18/28

2014-07-22 Thread Lee Jones

There is no way that you are introducing a 300 line driver and have
nothing to say about it.  Please put a nice descriptive commit log
here when you resubmit.

> Signed-off-by: Wei-Chun Pan 
> ---

I also expect to see a change log here and version information in the
$SUBJECT line of the email.  I think the next one is v3 (or is it
v4?), so the start of the subject should read [PATCH v4 x/y].

>  drivers/mfd/Kconfig  |   6 +
>  drivers/mfd/Makefile |   2 +
>  drivers/mfd/imanager2_core.c | 303 +++
>  3 files changed, 311 insertions(+)
>  create mode 100644 drivers/mfd/imanager2_core.c
> 
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 3383412..48b063f 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -10,6 +10,12 @@ config MFD_CORE
>   select IRQ_DOMAIN
>   default n
>  
> +config MFD_IMANAGER2
> + tristate "Support for Advantech iManager2 EC ICs"
> + select MFD_CORE
> + help
> +   Support for Advantech iManager2 EC ICs
> +
>  config MFD_CS5535
>   tristate "AMD CS5535 and CS5536 southbridge core functions"
>   select MFD_CORE
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index 2851275..10c64ae 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -166,3 +166,5 @@ obj-$(CONFIG_MFD_RETU)+= retu-mfd.o
>  obj-$(CONFIG_MFD_AS3711) += as3711.o
>  obj-$(CONFIG_MFD_AS3722) += as3722.o
>  obj-$(CONFIG_MFD_STW481X)+= stw481x.o
> +imanager2-objs   := imanager2_core.o imanager2_ec.o
> +obj-$(CONFIG_MFD_IMANAGER2)  += imanager2.o

No need to do this.  Just do:

obj-$(CONFIG_MFD_IMANAGER2) += imanager2_core.o imanager2_ec.o

> diff --git a/drivers/mfd/imanager2_core.c b/drivers/mfd/imanager2_core.c
> new file mode 100644
> index 000..2264d29
> --- /dev/null
> +++ b/drivers/mfd/imanager2_core.c
> @@ -0,0 +1,303 @@
> +/*
> + * imanager2_core.c - MFD core driver of Advantech EC IT8516/18/28
> + * Copyright (C) 2014  Richard Vidal-Dorsch 
> + *
> + * This program is free software: you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 3 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .
> + */
> +
> +#include 
> +#include 
> +#include 

You need more headers than this.

> +#define DRV_NAME "imanager2"

No need to #define the driver name.

Just use the string, where you need to use it.

> +#define DRV_VERSION  "4.0.1"

Remove this.

> +static struct mfd_cell imanager2_cells[] = {
> + {
> + .name = "imanager2_hwm",
> + },
> + {
> + .name = "imanager2_i2c",
> + },

Put these on a single line.

> +};
> +
> +enum chips {
> + it8516 = 0x8516,
> + it8518 = 0x8518,
> + it8528 = 0x8528,

s/it/CHIP_IT

> +};

Move this to the top.

> +#define EC_CMD_AUTHENTICATION0x30

Why is this separate from the rest of the #defines?

> +static int imanager2_authentication(struct imanager2 *ec)
> +{
> + u8 tmp;
> + int ret;
> +
> + mutex_lock(&ec->lock);
> +
> + if (inb(EC_IO_PORT_CMD) == 0xFF && inb(EC_IO_PORT_DATA) == 0xFF) {
> + ret = -ENODEV;
> + goto unlock;
> + }
> +
> + if (inb(EC_IO_PORT_CMD) & IO_FLAG_OBF)
> + inb(EC_IO_PORT_DATA);   /* initial OBF */

What's OBF?

> + if (ec_outb_after_ibc0(EC_IO_PORT_CMD, EC_CMD_AUTHENTICATION)) {
> + ret = -ENODEV;
> + goto unlock;
> + }
> +
> + ret = ec_inb_after_obf1(&tmp);
> +
> +unlock:
> + mutex_unlock(&ec->lock);
> +
> + if (ret)
> + return ret;

Remove this.

> + if (tmp != 0x95)

... and change this to:

if (!ret && tmp != 0x95)

But 0x95 should be #defined somewhere.

> + return -ENODEV;
> +
> + return 0;
> +}
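
A sketch of how the tail of that function could look with the two comments
above applied. EC_AUTH_MAGIC and ec_check_auth_reply() are made-up names, not
something from the patch. Note that the final return then has to be
'return ret', otherwise an error from the read is lost.

```c
#include <errno.h>

#define EC_AUTH_MAGIC 0x95  /* hypothetical name for the expected reply byte */

/* ret:   result of reading the reply (0 on success, negative errno on error)
 * reply: byte returned by the EC, only meaningful when ret == 0 */
static int ec_check_auth_reply(int ret, unsigned char reply)
{
	if (!ret && reply != EC_AUTH_MAGIC)
		return -ENODEV;

	return ret;
}
```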
> +
> +#define EC_ITE_CHIPID_H8 0x20
> +#define EC_ITE_CHIPID_L8 0x21

Group all of the #defines somewhere.

> +static int imanager2_get_chip_type(struct imanager2 *ec)
> +{
> + mutex_lock(&ec->lock);
> +
> + outb(EC_ITE_CHIPID_H8, EC_SIO_CMD);
> + ec->id = inb(EC_SIO_DATA) << 8;
> + outb(EC_ITE_CHIPID_L8, EC_SIO_CMD);
> + ec->id |= inb(EC_SIO_DATA);
> +
> + mutex_unlock(&ec->lock);
> +
> + switch (ec->id) {
> + case it8516:
> + case it8518:
> + ec->flag = EC_FLAG_IO;
> + break;
> + case it8528:
> + ec->flag |= EC_FLAG_IO_MAILBOX;
> + break;
> + default

Re: [PATCH] x86, TSC: Add a software TSC offset

2014-07-22 Thread Borislav Petkov

On Mon, Jul 21, 2014 at 10:40:39PM -0400, Steven Rostedt wrote:
> Your patch inspired me to write this hack. I was curious to know how
> the TSCs of my boxes were with respect to each other, and wanted to get
> an idea. Maybe there's a better way, but I decided to waste an hour and
> write this hack up.
> 
> Here's what it does. It creates the file /sys/kernel/debug/rdtsc_test,
> and when you read it, it does some whacky things.
> 
> 1) A table is set up with the number of possible CPUs. The table
> consists of: index, TSC count, CPU.
> 
> 2) An atomic variable is set to the number of online CPUs.
> 
> 3) An IPI is sent to each of the other CPUs to run the test.
> 
> 4) The test decrements the atomic, and then spins until it reaches zero.
> 
> 5) The caller of smp_call_function() then calls the test itself, being
> the last to decrement the counter causing it to go to zero and all CPUs
> then fight for a spinlock.
> 
> 6) When the spin lock is taken, it records which place it was in (order
> of spinlock taken), and records its own TSC. Then it releases the lock.
> 
> 7) It then records in the table where its position is, its TSC counter
> and CPU number.
> 
> 
> Finally, the read will show the results of the table. Looks something
> like this:
> 
> # cat /debug/rdtsc_test 
> 0)  1305910016816  (cpu:5)
> 1)  1305910017550  (cpu:7)
> 2)  1305910017712  (cpu:1)
> 3)  1305910017910  (cpu:6)
> 4)  1305910018042  (cpu:2)
> 5)  1305910018226  (cpu:3)
> 6)  1305910018416  (cpu:4)
> 7)  1305910018540  (cpu:0)
> 
> As long as the TSC counts are in order of the index, the TSC is moving
> forward nicely. If they are not in order, then the TSCs are not in sync.
> 
> Yes, this is a hack, but I think it's a somewhat useful hack.

I think so too, especially if one would like to take a look at the TSCs
and what they look like after, say, a suspend cycle, or a long machine
runtime when there is suspicion that some SMM might've been entered or so...

However, I think you can do all that from luserspace, even though you'd
have to do more dancing like pinning threads to cpus and those threads
would have to synchronize back on rdtsc_start after having read the TSC,
i.e.

while(atomic_read(&rdtsc_start) != num_online_cpus())
cpu_relax();

so that you can make sure they all have read the TSC at least once and
nothing gets reordered.

It might work if I'm not missing something.

Doing it in the kernel is much easier with all that ring0 functionality
present. :-)
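
Either way, kernel hack or userspace threads, you end up with the same table
and the same check on it. A minimal sketch of that check, assuming the samples
are stored in the order the lock was taken (struct tsc_sample and
tsc_first_out_of_order() are names invented here, not part of Steven's hack):

```c
#include <stddef.h>
#include <stdint.h>

struct tsc_sample {
	uint64_t tsc;  /* TSC value read by this CPU */
	int cpu;       /* CPU that took the sample */
};

/* Samples are stored in lock-acquisition order. Return the index of the
 * first sample whose TSC is lower than its predecessor's, or -1 if the
 * counts never go backwards. */
static int tsc_first_out_of_order(const struct tsc_sample *tbl, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (tbl[i].tsc < tbl[i - 1].tsc)
			return (int)i;

	return -1;
}
```

A return of -1 means the TSCs moved forward in the order the samples were
taken; anything else points at the first entry that looks out of sync.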

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

Re: [PATCH 1/7] arm: dts: omap3-gta04: Add nand support

2014-07-22 Thread Belisko Marek

On Tue, Jul 22, 2014 at 8:24 AM, Tony Lindgren  wrote:
> * Marek Belisko  [140721 14:08]:
>
> Can you please add the descriptions to all the patches?
OK thanks for review. I'll send v2.
> Other than that looks OK to me.
>
> Regards,
>
> Tony

BR,

marek

-- 
as simple and primitive as possible
-
Marek Belisko - OPEN-NANDRA
Freelance Developer

Ruska Nova Ves 219 | Presov, 08005 Slovak Republic
Tel: +421 915 052 184
skype: marekwhite
twitter: #opennandra
web: http://open-nandra.com

Re: [STLinux Kernel] [PATCH v3+1 5/5] ARM: DT: STi: STiH416: Add DT node for MiPHY365x

2014-07-22 Thread Maxime Coquelin


Hi Lee,

On 07/11/2014 01:54 PM, Lee Jones wrote:

The MiPHY365x is a Generic PHY which can serve various SATA or PCIe
devices. It has 2 ports which it can use for either; both SATA, both
PCIe or one of each in any configuration.

Acked-by: Mark Rutland 
Acked-by: Alexandre Torgue 
Signed-off-by: Lee Jones 



Added to my queue for v3.17.

Thanks,
Maxime

Re: [PATCH 3.12 000/170] 3.12.25-stable review

2014-07-22 Thread Jiri Slaby

On 07/19/2014 04:38 AM, Satoru Takeuchi wrote:
> At Fri, 18 Jul 2014 06:47:32 -0700,
> Guenter Roeck wrote:
>>
>> On 07/18/2014 05:16 AM, Jiri Slaby wrote:
>>> On 07/18/2014 02:12 PM, Jiri Slaby wrote:
>>>> This is the start of the stable review cycle for the 3.12.25 release.
>>>> There are 170 patches in this series, all will be posted as a response
>>>> to this one.  If anyone has any issues with these being applied, please
>>>> let me know.
>>>>
>>>> Responses should be made by Sun Jul 20 12:11:21 2014
>>>
>>> Oh, this should have been two *workdays*, i.e. the deadline is Jul 22.
>>>
>>
>> Build results:
>>  total: 137 pass: 137 fail: 0
>>
>> Qemu tests all passed.
>>
>> Details are available at http://server.roeck-us.net:8010/builders.
>>
>> Guenter
>>
> 
> Plus, it passed my test too.

On 07/21/2014 05:31 PM, Shuah Khan wrote:
> Compiled and booted on my test system. No dmesg regressions.

Thank you all!

-- 
js
suse labs

Re: Performance Impact of skb_segment Security Fix

2014-07-22 Thread Azqa Nadeem

 Hi,

 I am a researcher at EPFL, Switzerland. I study software vulnerabilities
 with the aim of building better tools to protect developers against security
 bugs. Recently, skb_segment() was patched
 (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1fd819ecb90cc9b822cd84d3056ddba315d3340f)
 fixing the CVE-2014-0131 vulnerability
 (http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-0131) in the Linux
 Kernel. I am interested in the performance implications of this patch; could
 you help me answer the following questions:

 Do you think the bug fix for the skb_segment() function can have any
 performance implications?  If so, how much will the added checks add to the
 run time of the function?
 Is the skb_segment() function part of the core functionality of the software?
 What fraction of time is expected to be spent in this function?

 Your answers will help us to better characterize the trade-offs between
 performance and security in popular software.

 --
 Regards,
 Azqa Nadeem
 Internee - Dependable Systems Lab
 EPFL, Switzerland

Re: [PATCH v3] direct-io: fix uninitialized warning in do_direct_IO()

2014-07-22 Thread Boaz Harrosh

On 07/21/2014 02:36 PM, Christoph Hellwig wrote:
> Looks good to me,
> 
> Reviewed-by: Christoph Hellwig 
> 
> While it looks obvious, did you make sure it passes xfstests, which
> has a lot of direct I/O tests?
> 

Thank you Christoph. OK, so I finally did last night. I ran it on
ext4. I'm just not used to running ext4 or any of that
stuff, so it took me some time. I have the usual 2 failures I also get
on 3.15. So I'd say it's good - tested.

Thanks
Boaz


Re: [PATCH 3/5] net/netfilter/ipvs/ip_vs_ctl.c: drop argument range check just before the check for equality

2014-07-22 Thread Julian Anastasov


Hello,

On Tue, 22 Jul 2014, Dan Carpenter wrote:

> On Mon, Jul 21, 2014 at 11:01:56PM +0300, Julian Anastasov wrote:
> > @@ -2333,13 +2339,12 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void 
> > __user *user, unsigned int len)
> > struct ip_vs_dest_user_kern udest;
> > struct netns_ipvs *ipvs = net_ipvs(net);
> >  
> > +   BUILD_BUG_ON(sizeof(arg) > 256);
> 
> 256 is off-by-one because u8 ranges from 0-255 so we are never able to
> copy 256 bytes into the "arg" buffer.

I'll change it to >= 256, to catch that we can
not hold such big size in unsigned char [gs]et_arglen and
also as an indication that we have to use allocated buffer
instead of stack.

> > -   if (copylen > 128)
> > +   if (*len < (int) copylen || *len < 0) {
> > +   pr_err("get_ctl: len %d < %u\n", *len, copylen);
> 
> Don't let users flood dmesg.  Just return an error.  (This can be
> triggered by non-root as well).

For now both set and get are privileged operations,
so we can keep it, it can catch if something wrong happens
with the structure sizes.
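
	To illustrate the off-by-one, a userspace sketch with
static_assert standing in for BUILD_BUG_ON (MAX_ARG_LEN and
store_len() are made up here, they are not from the patch):

```c
#include <assert.h>
#include <limits.h>

#define MAX_ARG_LEN 128  /* hypothetical size of the on-stack arg buffer */

/* Userspace stand-in for BUILD_BUG_ON(sizeof(arg) >= 256): the build
 * fails if the buffer size no longer fits in an unsigned char. */
static_assert(MAX_ARG_LEN <= UCHAR_MAX,
	      "arg length must fit in unsigned char");

/* What goes wrong otherwise: a length of 256 stored in an unsigned
 * char wraps around to 0. */
static unsigned char store_len(unsigned int len)
{
	return (unsigned char)len;
}
```

	With "> 256" a buffer of exactly 256 bytes would pass the
build check while its length no longer fits in the table.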

Regards

--
Julian Anastasov 

Re: [RESEND PATCH 1/1] ARM: DT: STi: STiH416: Add DT node for ST's SATA device

2014-07-22 Thread Maxime Coquelin


Hi Lee,

On 07/21/2014 10:32 AM, Lee Jones wrote:

Cc: devicet...@vger.kernel.org
Acked-by: Alexandre Torgue 
Signed-off-by: Lee Jones 
---
  Documentation/devicetree/bindings/ata/ahci-st.txt |  2 +-
  arch/arm/boot/dts/stih416-b2020.dts   |  4 
  arch/arm/boot/dts/stih416-b2020e.dts  |  4 
  arch/arm/boot/dts/stih416.dtsi| 16 
  4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/ata/ahci-st.txt b/Documentation/devicetree/bindings/ata/ahci-st.txt
index 0574a77..9883542 100644
--- a/Documentation/devicetree/bindings/ata/ahci-st.txt
+++ b/Documentation/devicetree/bindings/ata/ahci-st.txt
@@ -21,7 +21,7 @@ Example:
reg = <0xfe38 0x1000>;
interrupts  = ;
interrupt-names = "hostc";
-   phys= <&miphy365x_phy MIPHY_PORT_0 MIPHY_TYPE_SATA>;
+   phys= <&phy_port0 MIPHY_TYPE_SATA>;
phy-names   = "ahci_phy";
resets  = <&powerdown STIH416_SATA0_POWERDOWN>,
  <&softreset STIH416_SATA0_SOFTRESET>;


Patch does not apply because this file does not exist in my tree.
Shouldn't it be in a separate patch?

Regards,
Maxime

[PATCH v4 1/2] usb: doc: udc-xilinx: Add devicetree bindings

2014-07-22 Thread Subbaraya Sundeep Bhatta

Add devicetree bindings for Xilinx udc driver.

Signed-off-by: Subbaraya Sundeep Bhatta 
---
Changes for v4:
- renamed xlnx,axi-usb2-device-4.00.a to xlnx,usb2-device-4.00.a
Changes for v3:
- None
Changes for v2:
- replaced xlnx,include-dma with xlnx,has-builtin-dma

 .../devicetree/bindings/usb/udc-xilinx.txt |   18 ++
 1 files changed, 18 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/usb/udc-xilinx.txt

diff --git a/Documentation/devicetree/bindings/usb/udc-xilinx.txt b/Documentation/devicetree/bindings/usb/udc-xilinx.txt
new file mode 100644
index 000..47b4e39
--- /dev/null
+++ b/Documentation/devicetree/bindings/usb/udc-xilinx.txt
@@ -0,0 +1,18 @@
+Xilinx USB2 device controller
+
+Required properties:
+- compatible   : Should be "xlnx,usb2-device-4.00.a"
+- reg  : Physical base address and size of the USB2
+ device registers map.
+- interrupts   : Should contain single irq line of USB2 device
+ controller
+- xlnx,has-builtin-dma : if DMA is included
+
+Example:
+   axi-usb2-device@42e0 {
+compatible = "xlnx,usb2-device-4.00.a";
+interrupts = <0x0 0x39 0x1>;
+reg = <0x42e0 0x1>;
+xlnx,has-builtin-dma;
+};
+
-- 
1.7.4



Re: [PATCH 1/2] drivers/i2c/busses: use correct type for dma_map/unmap

2014-07-22 Thread Ludovic Desroches

On Mon, Jul 21, 2014 at 11:42:03AM +0200, Wolfram Sang wrote:
> dma_{un}map_* uses 'enum dma_data_direction' not 'enum 
> dma_transfer_direction'.
> 
> Signed-off-by: Wolfram Sang 
Acked-by: Ludovic Desroches 

Thanks Wolfram.
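
For reference, a small sketch of how the two enums relate. The enum
definitions below are local copies written for illustration (the real ones
are in the kernel headers), and xfer_to_map_dir() is a hypothetical helper,
not an existing kernel function. Only the two slave directions are handled.

```c
/* Local copies for illustration only; in the kernel these come from
 * <linux/dma-direction.h> and <linux/dmaengine.h>. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

enum dma_transfer_direction {
	DMA_MEM_TO_MEM,
	DMA_MEM_TO_DEV,
	DMA_DEV_TO_MEM,
	DMA_DEV_TO_DEV,
	DMA_TRANS_NONE,
};

/* Hypothetical helper: pick the dma_map/unmap direction that matches a
 * dmaengine slave transfer direction. */
static enum dma_data_direction xfer_to_map_dir(enum dma_transfer_direction d)
{
	switch (d) {
	case DMA_MEM_TO_DEV:
		return DMA_TO_DEVICE;
	case DMA_DEV_TO_MEM:
		return DMA_FROM_DEVICE;
	default:
		return DMA_NONE;  /* not a slave transfer, nothing to map here */
	}
}
```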

> ---
>  drivers/i2c/busses/i2c-at91.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-at91.c b/drivers/i2c/busses/i2c-at91.c
> index e95f9ba96790..83c989382be9 100644
> --- a/drivers/i2c/busses/i2c-at91.c
> +++ b/drivers/i2c/busses/i2c-at91.c
> @@ -210,7 +210,7 @@ static void at91_twi_write_data_dma_callback(void *data)
>   struct at91_twi_dev *dev = (struct at91_twi_dev *)data;
>  
>   dma_unmap_single(dev->dev, sg_dma_address(&dev->dma.sg),
> -  dev->buf_len, DMA_MEM_TO_DEV);
> +  dev->buf_len, DMA_TO_DEVICE);
>  
>   at91_twi_write(dev, AT91_TWI_CR, AT91_TWI_STOP);
>  }
> @@ -289,7 +289,7 @@ static void at91_twi_read_data_dma_callback(void *data)
>   struct at91_twi_dev *dev = (struct at91_twi_dev *)data;
>  
>   dma_unmap_single(dev->dev, sg_dma_address(&dev->dma.sg),
> -  dev->buf_len, DMA_DEV_TO_MEM);
> +  dev->buf_len, DMA_FROM_DEVICE);
>  
>   /* The last two bytes have to be read without using dma */
>   dev->buf += dev->buf_len - 2;
> -- 
> 2.0.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-i2c" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v5 3/3] arm64: Add seccomp support

2014-07-22 Thread AKASHI Takahiro

secure_computing() should always be called first in syscall_trace_enter().

If secure_computing() returns -1, we should stop further handling. Then
that system call may eventually fail with a specified return value (errno),
be trapped, or the process itself may be killed, depending on the loaded rules.
In these cases, syscall_trace_enter() also returns -1, which results in
skipping normal syscall handling as well as syscall_trace_exit().

Signed-off-by: AKASHI Takahiro 
---
 arch/arm64/Kconfig   |   14 ++
 arch/arm64/include/asm/seccomp.h |   25 +
 arch/arm64/include/asm/unistd.h  |3 +++
 arch/arm64/kernel/ptrace.c   |5 +
 4 files changed, 47 insertions(+)
 create mode 100644 arch/arm64/include/asm/seccomp.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 3a18571..eeac003 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -32,6 +32,7 @@ config ARM64
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KGDB
+   select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_C_RECORDMCOUNT
select HAVE_DEBUG_BUGVERBOSE
@@ -259,6 +260,19 @@ config ARCH_HAS_CACHE_LINE_SIZE
 
 source "mm/Kconfig"
 
+config SECCOMP
+   bool "Enable seccomp to safely compute untrusted bytecode"
+   ---help---
+ This kernel feature is useful for number crunching applications
+ that may need to compute untrusted bytecode during their
+ execution. By using pipes or other transports made available to
+ the process as file descriptors supporting the read/write
+ syscalls, it's possible to isolate those applications in
+ their own address space using seccomp. Once seccomp is
+ enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
+ and the task is only allowed to execute a few safe syscalls
+ defined by each seccomp mode.
+
 config XEN_DOM0
def_bool y
depends on XEN
diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
new file mode 100644
index 000..c76fac9
--- /dev/null
+++ b/arch/arm64/include/asm/seccomp.h
@@ -0,0 +1,25 @@
+/*
+ * arch/arm64/include/asm/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_SECCOMP_H
+#define _ASM_SECCOMP_H
+
+#include 
+
+#ifdef CONFIG_COMPAT
+#define __NR_seccomp_read_32   __NR_compat_read
+#define __NR_seccomp_write_32  __NR_compat_write
+#define __NR_seccomp_exit_32   __NR_compat_exit
+#define __NR_seccomp_sigreturn_32  __NR_compat_rt_sigreturn
+#endif /* CONFIG_COMPAT */
+
+#include 
+
+#endif /* _ASM_SECCOMP_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index c980ab7..729c155 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -31,6 +31,9 @@
  * Compat syscall numbers used by the AArch64 kernel.
  */
 #define __NR_compat_restart_syscall0
+#define __NR_compat_exit   1
+#define __NR_compat_read   3
+#define __NR_compat_write  4
 #define __NR_compat_sigreturn  119
 #define __NR_compat_rt_sigreturn   173
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 100d7d1..e477f6f 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1115,6 +1116,10 @@ asmlinkage int syscall_trace_enter(struct pt_regs *regs)
saved_x0 = regs->regs[0];
saved_x8 = regs->regs[8];
 
+   if (secure_computing(regs->syscallno) == -1)
+   /* seccomp failures shouldn't expose any additional code. */
+   return -1;
+
if (test_thread_flag(TIF_SYSCALL_TRACE))
tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
-- 
1.7.9.5


[PATCH v5 1/3] arm64: ptrace: reload a syscall number after ptrace operations

2014-07-22 Thread AKASHI Takahiro

Arm64 holds the syscall number in the w8 (x8) register. A ptrace tracer may
change its value to either:
  * any valid syscall number to alter a system call, or
  * -1 to skip a system call

This patch implements this behavior by reloading that value into syscallno
in struct pt_regs after tracehook_report_syscall_entry() or
secure_computing(). In the case of '-1', the return value of the system call
can also be changed by the tracer setting the value in the x0 register, and so
sys_ni_syscall() should not be called.

See also:
42309ab4, ARM: 8087/1: ptrace: reload syscall number after
  secure_computing() check

Signed-off-by: AKASHI Takahiro 
---
 arch/arm64/kernel/entry.S  |2 ++
 arch/arm64/kernel/ptrace.c |   13 +
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 5141e79..de8bdbc 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -628,6 +628,8 @@ ENDPROC(el0_svc)
 __sys_trace:
mov x0, sp
bl  syscall_trace_enter
+   cmp w0, #-1 // skip syscall?
+   b.eqret_to_user
adr lr, __sys_trace_return  // return address
uxtwscno, w0// syscall number (possibly new)
mov x1, sp  // pointer to regs
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 70526cf..100d7d1 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -21,6 +21,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1109,9 +1110,21 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 
 asmlinkage int syscall_trace_enter(struct pt_regs *regs)
 {
+   unsigned long saved_x0, saved_x8;
+
+   saved_x0 = regs->regs[0];
+   saved_x8 = regs->regs[8];
+
if (test_thread_flag(TIF_SYSCALL_TRACE))
tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 
+   regs->syscallno = regs->regs[8];
+   if ((long)regs->syscallno == ~0UL) { /* skip this syscall */
+   regs->regs[8] = saved_x8;
+   if (regs->regs[0] == saved_x0) /* not changed by user */
+   regs->regs[0] = -ENOSYS;
+   }
+
if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
trace_sys_enter(regs, regs->syscallno);
 
-- 
1.7.9.5


[PATCH v5 0/3] arm64: Add seccomp support

2014-07-22 Thread AKASHI Takahiro

(Please apply this patch after my audit patch in order to avoid some
conflict on arm64/Kconfig.)

This patch enables secure computing (system call filtering) on arm64.
System calls can be allowed or denied by loaded bpf-style rules.
Architecture specific part is to run secure_computing() on syscall entry
and check the result. See [3/3]
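
As a rough userspace model of the resulting syscall-entry flow (this is not
the kernel code: struct regs_model and syscall_enter_model() are names
invented for this sketch, and the tracer is modelled as "the register values
it left behind"):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct pt_regs. */
struct regs_model {
	unsigned long x0;   /* first argument / return value */
	unsigned long x8;   /* syscall number as seen by a tracer */
	long syscallno;     /* syscall number the kernel will dispatch */
};

/* seccomp_ret: result of the filter, -1 means "do not run this syscall".
 * after_trace: x0/x8 as left by a ptrace tracer, or NULL if not traced.
 * Returns the syscall number to run, or -1 to skip the syscall. */
static long syscall_enter_model(struct regs_model *regs, int seccomp_ret,
				const struct regs_model *after_trace)
{
	unsigned long saved_x0 = regs->x0;
	unsigned long saved_x8 = regs->x8;

	/* seccomp runs first; on failure nothing else is looked at */
	if (seccomp_ret == -1)
		return -1;

	if (after_trace) {
		regs->x0 = after_trace->x0;
		regs->x8 = after_trace->x8;
	}

	/* reload the (possibly changed) syscall number */
	regs->syscallno = (long)regs->x8;
	if (regs->syscallno == -1) {          /* tracer asked to skip */
		regs->x8 = saved_x8;
		if (regs->x0 == saved_x0)     /* tracer set no return value */
			regs->x0 = (unsigned long)-ENOSYS;
	}

	return regs->syscallno;
}
```

In both "-1" cases the caller goes straight back to userspace, which is what
the entry.S change in [1/3] does.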

Prerequisites are:
 * "arm64: Add audit support" patch

This code is tested on ARMv8 fast model using
 * libseccomp v2.1.1 with modifications for arm64 and verified by its "live"
   tests, 20, 21 and 24.
 * modified version of Kees' seccomp test for 'changing/skipping a syscall'
   behavior

Changes v4 -> v5:
* rebased to v3.16-rc
* add patch [1/3] to allow ptrace to change a system call
  (please note that this patch should be applied even without seccomp.)

Changes v3 -> v4:
* removed the following patch and moved it to "arm64: prerequisites for
  audit and ftrace" patchset since it is required for audit and ftrace in
  case of !COMPAT, too.
  "arm64: is_compat_task is defined both in asm/compat.h and linux/compat.h"

Changes v2 -> v3:
* removed unnecessary 'type cast' operations [2/3]
* check for a return value (-1) of secure_computing() explicitly [2/3]
* aligned with the patch, "arm64: split syscall_trace() into separate
  functions for enter/exit" [2/3]
* changed default of CONFIG_SECCOMP to n [2/3]

Changes v1 -> v2:
* added generic seccomp.h for arm64 to utilize it [1,2/3] 
* changed syscall_trace() to return more meaningful value (-EPERM)
  on seccomp failure case [2/3]
* aligned with the change in "arm64: make a single hook to syscall_trace()
  for all syscall features" v2 [2/3]
* removed is_compat_task() definition from compat.h [3/3]

AKASHI Takahiro (3):
  arm64: ptrace: reload a syscall number after ptrace operations
  asm-generic: Add generic seccomp.h for secure computing mode 1
  arm64: Add seccomp support

 arch/arm64/Kconfig   |   14 ++
 arch/arm64/include/asm/seccomp.h |   25 +
 arch/arm64/include/asm/unistd.h  |3 +++
 arch/arm64/kernel/entry.S|2 ++
 arch/arm64/kernel/ptrace.c   |   18 ++
 include/asm-generic/seccomp.h|   28 
 6 files changed, 90 insertions(+)
 create mode 100644 arch/arm64/include/asm/seccomp.h
 create mode 100644 include/asm-generic/seccomp.h

-- 
1.7.9.5


[PATCH v5 2/3] asm-generic: Add generic seccomp.h for secure computing mode 1

2014-07-22 Thread AKASHI Takahiro

Those values (__NR_seccomp_*) are used solely in secure_computing()
to identify mode 1 system calls. If compat system calls have different
syscall numbers, asm/seccomp.h may override them.

Acked-by: Arnd Bergmann 
Signed-off-by: AKASHI Takahiro 
---
 include/asm-generic/seccomp.h |   28 
 1 file changed, 28 insertions(+)
 create mode 100644 include/asm-generic/seccomp.h

diff --git a/include/asm-generic/seccomp.h b/include/asm-generic/seccomp.h
new file mode 100644
index 000..5e97022
--- /dev/null
+++ b/include/asm-generic/seccomp.h
@@ -0,0 +1,28 @@
+/*
+ * include/asm-generic/seccomp.h
+ *
+ * Copyright (C) 2014 Linaro Limited
+ * Author: AKASHI Takahiro 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _ASM_GENERIC_SECCOMP_H
+#define _ASM_GENERIC_SECCOMP_H
+
+#include 
+
+#if defined(CONFIG_COMPAT) && !defined(__NR_seccomp_read_32)
+#define __NR_seccomp_read_32   __NR_read
+#define __NR_seccomp_write_32  __NR_write
+#define __NR_seccomp_exit_32   __NR_exit
+#define __NR_seccomp_sigreturn_32  __NR_rt_sigreturn
+#endif /* CONFIG_COMPAT && ! already defined */
+
+#define __NR_seccomp_read  __NR_read
+#define __NR_seccomp_write __NR_write
+#define __NR_seccomp_exit  __NR_exit
+#define __NR_seccomp_sigreturn __NR_rt_sigreturn
+
+#endif /* _ASM_GENERIC_SECCOMP_H */
-- 
1.7.9.5


Re: [PATCH 3/5] net/netfilter/ipvs/ip_vs_ctl.c: drop argument range check just before the check for equality

2014-07-22 Thread Dan Carpenter

On Tue, Jul 22, 2014 at 11:52:02AM +0300, Julian Anastasov wrote:
> > > - if (copylen > 128)
> > > + if (*len < (int) copylen || *len < 0) {
> > > + pr_err("get_ctl: len %d < %u\n", *len, copylen);
> > 
> > Don't let users flood dmesg.  Just return an error.  (This can be
> > triggered by non-root as well).
> 
>   For now both set and get are privileged operations,
> so we can keep it, it can catch if something wrong happens
> with the structure sizes.

If you have namespaces enabled then it's not *that* privileged.

regards,
dan carpenter


Re: [PATCH v2 1/3] usb: dwc3: add ST dwc3 glue layer to manage dwc3 HC

2014-07-22 Thread Peter Griffin

Hi Lee,

Thanks for reviewing; see my comments inline below:

On Mon, 07 Jul 2014, Lee Jones wrote:

> On Sat, 05 Jul 2014, Peter Griffin wrote:
> 
> > This patch adds the ST glue logic to manage the DWC3 HC
> > on STiH407 SoC family. It manages the powerdown signal,
> > and configures the internal glue logic and syscfg registers.
> > 
> > Signed-off-by: Giuseppe Cavallaro 
> > Signed-off-by: Peter Griffin 
> > ---
> >  drivers/usb/dwc3/Kconfig   |   9 ++
> >  drivers/usb/dwc3/Makefile  |   1 +
> >  drivers/usb/dwc3/dwc3-st.c | 325 +
> >  3 files changed, 335 insertions(+)
> >  create mode 100644 drivers/usb/dwc3/dwc3-st.c
> > 
> > diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
> > index 8eb996e..6c85c43 100644
> > --- a/drivers/usb/dwc3/Kconfig
> > +++ b/drivers/usb/dwc3/Kconfig
> > @@ -79,6 +79,15 @@ config USB_DWC3_KEYSTONE
> >   Support of USB2/3 functionality in TI Keystone2 platforms.
> >   Say 'Y' or 'M' here if you have one such device
> >  
> > +config USB_DWC3_ST
> > +   tristate "STMicroelectronics Platforms"
> > +   depends on ARCH_STI && OF
> > +   default USB_DWC3_HOST
> > +   help
> > + STMicroelectronics SoCs with one DesignWare Core USB3 IP
> > + inside (i.e. STiH407).
> > + Say 'Y' or 'M' if you have one such device.
> > +
> >  comment "Debugging features"
> >  
> >  config USB_DWC3_DEBUG
> > diff --git a/drivers/usb/dwc3/Makefile b/drivers/usb/dwc3/Makefile
> > index 10ac3e7..11c9f54 100644
> > --- a/drivers/usb/dwc3/Makefile
> > +++ b/drivers/usb/dwc3/Makefile
> > @@ -33,3 +33,4 @@ obj-$(CONFIG_USB_DWC3_OMAP)   += dwc3-omap.o
> >  obj-$(CONFIG_USB_DWC3_EXYNOS)  += dwc3-exynos.o
> >  obj-$(CONFIG_USB_DWC3_PCI) += dwc3-pci.o
> >  obj-$(CONFIG_USB_DWC3_KEYSTONE)+= dwc3-keystone.o
> > +obj-$(CONFIG_USB_DWC3_ST)  += dwc3-st.o
> > diff --git a/drivers/usb/dwc3/dwc3-st.c b/drivers/usb/dwc3/dwc3-st.c
> > new file mode 100644
> > index 000..2cae9d3
> > --- /dev/null
> > +++ b/drivers/usb/dwc3/dwc3-st.c
> > @@ -0,0 +1,325 @@
> > +/**
> > + * dwc3-st.c Support for dwc3 platform devices on ST Microelectronics platforms
> > + *
> > + * This is a small platform driver for the dwc3 to provide the glue logic
> > + * to configure the controller. Tested on STi platforms.
> 
> Not sure about the use of the term 'platform driver' here and in the
> title.  We don't normally differentiate.  I can find examples to the
> contrary, but not many.

Ok, removed 'platform' in v3.

> 
> > + * Copyright (C) 2014 Stmicroelectronics
> > + *
> > + * Author: Giuseppe Cavallaro 
> > + * Contributors: Aymen Bouattay 
> > + *   Peter Griffin 
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * Inspired by dwc3-omap.c and dwc3-exynos.c.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "core.h"
> > +#include "io.h"
> > +
> > +/* Reg glue registers */
> > +#define USB2_CLKRST_CTRL 0x00
> > +#define aux_clk_en(n) ((n)<<0)
> > +#define sw_pipew_reset_n(n) ((n)<<4)
> > +#define ext_cfg_reset_n(n) ((n)<<8)
> > +#define xhci_revision(n) ((n)<<12)
> 
> These all need reformatting, see CodingStyle - 3.1: Spaces

Ok, I have added a space on either side of the shift operator and aligned
the macros using tabs.

> 
>   #define xhci_revision(n) ((n) << 12)
> 
> Lining them up with TABs would make them easier to read.

Ok, fixed in v3.

> 
> Also, I don't think there is a requirement to encapsulate the 'n'.

Ok, removed the brackets around the 'n'.

> 
> > +#define USB2_VBUS_MNGMNT_SEL1 0x2C
> > +/*
> > + * 2'b00 : Override value from Reg 0x30 is selected
> > + * 2'b01 : utmiotg_vbusvalid from usb3_top top is selected
> > + * 2'b10 : pipew_powerpresent from PIPEW instance is selected
> > + * 2'b11 : value is 1'b0
> > + */
> 
> What is this documenting?  Isn't documentation meant to make things
> clearer?  Now I'm just really confused - by the documentation.

It is documenting the bitfields in the VBUS_MNGMNT_SEL1 register. I've
hopefully made it clearer by adding the following comment and adjusting
the descriptions slightly.

/*
 * For all fields in USB2_VBUS_MNGMNT_SEL1
 * 2'b00 : Override value from Reg 0x30 is selected
 * 2'b01 : utmiotg_ from usb3_top is selected
 * 2'b10 : pipew_ from PIPEW instance is selected
 * 2'b11 : value is 1'b0
 */

Apart from that it's a standard way to describe bitfields. You can find
some examples in cx231xx-pcb-cfg.h, bnx2x_link.h and cx231xx-avcore.c

> 
> > +#define SEL_OVERRIDE_VBUSVALID(n) ((n)<<0)
> > +#define SEL_OVERRIDE_POWERPRESEN

Re: [PATCH v2] x86, hotplug: fix llc shared map unreleased during cpu hotplug

2014-07-22 Thread Chen, Gong

On Tue, Jul 22, 2014 at 04:04:52PM +0800, Wanpeng Li wrote:
> Subject: [PATCH v2] x86, hotplug: fix llc shared map unreleased during cpu
>  hotplug

See this link:
https://lkml.org/lkml/2014/7/17/78


