from:"Santosh Shilimkar"

Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'

2013-03-17 Thread Santosh Shilimkar

On Friday 15 March 2013 10:30 AM, Will Deacon wrote:
> On Thu, Mar 14, 2013 at 01:08:00PM +0530, Santosh Shilimkar wrote:
>> Will,
> 
> Hi guys,
> 
> I'm out of the office at the moment and have really terrible connectivity,
> so I can't do too much until next week. However, I don't think adding the
> has_ossr check is the right fix for this problem.
> 
>> On Wednesday 13 March 2013 05:59 PM, Lokesh Vutla wrote:
>>> Hi Dietmar,
>>> On Wednesday 13 March 2013 05:35 PM, Dietmar Eggemann wrote:
>>>> On 13/03/13 06:52, Lokesh Vutla wrote:
>>>>> Commit {9a6eb31 ARM: hw_breakpoint: Debug powerdown support for
>>>>> self-hosted
>>>>> debug} introduces debug powerdown support for self-hosted debug.
>>>>> While merging the patch 'has_ossr' check was removed which
>>>>> was needed for hardwares which doesn't support self-hosted debug.
>>>>> Pandaboard (A9) is one such hardware and Dietmar's orginial
>>>>> patch did mention this issue.
>>>>> Without that check on Panda with CPUIDLE enabled, a flood of
>>>>> below messages thrown.
>>>>>
>>>>> [ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch
>>>>> [ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch
> 
> Ok, so this means that we've taken an undefined instruction exception while
> trying to reset the debug registers on the PM_EXIT path. Now, the code there
> deals with CPUs that don't have the save/restore registers just fine, so
> that shouldn't have anything to do with this problem, particularly if the
> bit that is tripping us up is related to clearing vector catch.
>
Agree.
 
> Furthermore, I was under the impression that hw_breakpoint did actually
> work on panda, which implies that a cold boot *does* manage to reset the
> registers (can you please confirm this by looking in your dmesg during
> boot?). In that case, it seems as though a PM cycle is powering down a
> bunch of debug logic that was powered up during boot, and then we trip over
> because we can't access the register bank.
> 
Actually it seems to be without PM. Thanks to analysis from Lokesh, the issue
can be seen even with just suspend or cpu hotplug. So cold boot as such is
fine.

> The proper solution to this problem requires us to establish exactly what is
> turning off the debug registers, and then having an OMAP PM notifier to
> enable it again. Assuming this has always been the case, I expect hardware
> debug across PM fails silently with older kernels.
> 
This has been always the case it seems with CPU power cycle.
After the CPU is power cycled, 'DBGAUTHSTATUS' reads '0xaa' rather
than '0xaf' which means 'DBGEN = 0' and hence code fails to enable
monitor mode. This happens on both secure and GP devices and it can not
be patched since the secure code is ROM'ed. We didn't notice so far
because hw_breakpoint support was not default enabled on OMAP till the
multi-platform build.

>> I was also wondering whether we should just warn once rather
>> than continuous warnings in the notifier. Patch is end of the
>> email.
> 
> Could do, but I'd like to see a fix for the real issue before we simply hide
> the warnings :)
> 
Agree here too. As evident above, the feature won't work on OMAP4
devices with PM and hence some solution is needed.

What you think of below ?


>From d74b4264f6a5967b0f7ada96aad77ab0ac30dbed Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar 
Date: Mon, 18 Mar 2013 11:59:04 +0530
Subject: [PATCH] ARM: hw_breakpoints: Check for CPU debug availability before
 enabling it

CPU debug features like hardware break, watchpoints can be used only when
the debug mode is enabled and available for non-secure mode.

Hence check 'DBGAUTHSTATUS.DBGEN' before proceeding to enable the
features.

Thanks to Will for pointers and Lokesh for the analysis of the issue on
OMAP4 where after a CPU power cycle, debug mode gets disabled.

Cc: Will Deacon 

Tested-by: Lokesh Vutla 
Signed-off-by: Santosh Shilimkar 
---
 arch/arm/kernel/hw_breakpoint.c |8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 96093b7..683a7cf 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -930,6 +930,14 @@ static void reset_ctrl_regs(void *unused)
int i, raw_num_brps, err = 0, cpu = smp_processor_id();
u32 val;
 
+   /* Check if we have access to CPU debug features */
+   ARM_DBG_READ(c7, c14, 6, val);
+   if ((val & 0x1) == 0) {
+   pr_warn_once("CPU %d debug is unavailable\n"

Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'

2013-03-18 Thread Santosh Shilimkar

On Monday 18 March 2013 08:37 PM, Will Deacon wrote:
> Hi Santosh,
> 
> On Mon, Mar 18, 2013 at 06:51:30AM +, Santosh Shilimkar wrote:
>> On Friday 15 March 2013 10:30 AM, Will Deacon wrote:
>>> Furthermore, I was under the impression that hw_breakpoint did actually
>>> work on panda, which implies that a cold boot *does* manage to reset the
>>> registers (can you please confirm this by looking in your dmesg during
>>> boot?). In that case, it seems as though a PM cycle is powering down a
>>> bunch of debug logic that was powered up during boot, and then we trip over
>>> because we can't access the register bank.
>>>
>> Actually it seems to be without PM. Thanks to analysis from Lokesh, the issue
>> can be seen even with just suspend or cpu hotplug. So cold boot as such is
>> fine.
> 
> Right, so what you're saying is that monitor-mode hardware debug only works
> until the first pm/suspend/hotplug operation, then it's dead in the water?
> 
>>> The proper solution to this problem requires us to establish exactly what is
>>> turning off the debug registers, and then having an OMAP PM notifier to
>>> enable it again. Assuming this has always been the case, I expect hardware
>>> debug across PM fails silently with older kernels.
>>>
>> This has been always the case it seems with CPU power cycle.
>> After the CPU is power cycled, 'DBGAUTHSTATUS' reads '0xaa' rather
>> than '0xaf' which means 'DBGEN = 0' and hence code fails to enable
>> monitor mode. This happens on both secure and GP devices and it can not
>> be patched since the secure code is ROM'ed. We didn't notice so far
>> because hw_breakpoint support was not default enabled on OMAP till the
>> multi-platform build.
> 
> That really sucks :( Does this affect all OMAP-based boards?
> 
All OMAP4 based boards..

>>>> I was also wondering whether we should just warn once rather
>>>> than continuous warnings in the notifier. Patch is end of the
>>>> email.
>>>
>>> Could do, but I'd like to see a fix for the real issue before we simply hide
>>> the warnings :)
>>>
>> Agree here too. As evident above, the feature won't work on OMAP4
>> devices with PM and hence some solution is needed.
>>
>> What you think of below ?
> 
> Comments inline...
> 
>>
>> From d74b4264f6a5967b0f7ada96aad77ab0ac30dbed Mon Sep 17 00:00:00 2001
>> From: Santosh Shilimkar 
>> Date: Mon, 18 Mar 2013 11:59:04 +0530
>> Subject: [PATCH] ARM: hw_breakpoints: Check for CPU debug availability before
>>  enabling it
>>
>> CPU debug features like hardware break, watchpoints can be used only when
>> the debug mode is enabled and available for non-secure mode.
>>
>> Hence check 'DBGAUTHSTATUS.DBGEN' before proceeding to enable the
>> features.
>>
>> Thanks to Will for pointers and Lokesh for the analysis of the issue on
>> OMAP4 where after a CPU power cycle, debug mode gets disabled.
>>
>> Cc: Will Deacon 
>>
>> Tested-by: Lokesh Vutla 
>> Signed-off-by: Santosh Shilimkar 
>> ---
>>  arch/arm/kernel/hw_breakpoint.c |8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/arm/kernel/hw_breakpoint.c 
>> b/arch/arm/kernel/hw_breakpoint.c
>> index 96093b7..683a7cf 100644
>> --- a/arch/arm/kernel/hw_breakpoint.c
>> +++ b/arch/arm/kernel/hw_breakpoint.c
>> @@ -930,6 +930,14 @@ static void reset_ctrl_regs(void *unused)
>>  int i, raw_num_brps, err = 0, cpu = smp_processor_id();
>>  u32 val;
>>  
>> +/* Check if we have access to CPU debug features */
>> +ARM_DBG_READ(c7, c14, 6, val);
>> +if ((val & 0x1) == 0) {
>> +pr_warn_once("CPU %d debug is unavailable\n", cpu);
>> +cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu));
>> +return;
>> +}
> 
> There are a few of problems here:
> 
>   1. That is only checking for non-secure access, which precludes
>  running Linux in secure mode.
> 
We can check bit 4 as well to take care linux running in secure mode.

>   2. DBGAUTHSTATUS accesses are UNPREDICTABLE when the double-lock is
>  set for v7.1 processors.
>
>   3. DBGAUTHSTATUS doesn't exist in ARMv6.
> 
We cans skip the code for these versions like...
if (!ARM_DEBUG_ARCH_V7_ECP14 == get_debug_arch())
goto skip_dbgsts_read;

>   4. CPUs without the security extensions ha

Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'

2013-03-18 Thread Santosh Shilimkar

On Monday 18 March 2013 10:36 PM, Will Deacon wrote:
> On Mon, Mar 18, 2013 at 03:46:28PM +0000, Santosh Shilimkar wrote:
>> On Monday 18 March 2013 08:37 PM, Will Deacon wrote:
>>> That really sucks :( Does this affect all OMAP-based boards?
>>>
>> All OMAP4 based boards..
> 
> Brilliant. Is there any way that the secure code can be fixed in future
> products?
> 
Nope. This can only be done with new silicon rev for GP devices which is
not going to happen. For secure devices, some secure patching is possible
but these are not development devices, so not much point.

>>>> +  /* Check if we have access to CPU debug features */
>>>> +  ARM_DBG_READ(c7, c14, 6, val);
>>>> +  if ((val & 0x1) == 0) {
>>>> +  pr_warn_once("CPU %d debug is unavailable\n", cpu);
>>>> +  cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu));
>>>> +  return;
>>>> +  }
>>>
>>> There are a few of problems here:
>>>
>>> 1. That is only checking for non-secure access, which precludes
>>>running Linux in secure mode.
>>>
>> We can check bit 4 as well to take care linux running in secure mode.
> 
> It still doesn't help unless we know whether Linux is running secure or
> non-secure.
> 
ok.

>>> 2. DBGAUTHSTATUS accesses are UNPREDICTABLE when the double-lock is
>>>set for v7.1 processors.
>>>
>>> 3. DBGAUTHSTATUS doesn't exist in ARMv6.
>>>
>> We cans skip the code for these versions like...
>>  if (!ARM_DEBUG_ARCH_V7_ECP14 == get_debug_arch())
>>  goto skip_dbgsts_read;
> 
> Sure, I was just pointing out that the code needs fixing for this.
> 
>>> 4. CPUs without the security extensions have DBGAUTHSTATUS.NSE == 0
>>>
>> Which bit is that ? I don't find this bit in Cortex-A9 docs. With all
>> these checks, may be a separate function like 'is_debug_available()'
>> would be better.
> 
> NSE is bit 0 (the one you're checking).
> 
ok. So the subject patch might break those devices.

>>  
>>> 5. Accessing DBGAUTHSTATUS requires DBGEN to be driven high.
>>>Assuming that your pr_warn_once is emitted, this implies that
>>>DBGEN is driven high from cold boot, yet the NSE bit is low,
>>>implying that DBGEN is also low. That's contradictory, so I have
>>>no idea what's going on...
>>>
>> Me neither. The warning is emitted at least.
> 
> Any chance you could follow up with your firmware/hardware guys about this
> please? I'd really like to understand how we end up in this state in case we
> can do something in the architecture to stop it from happening in future.
> 
I will check on this part and update you when I hear from them.

>>> Apart from that, it's fine!
>>>
>> If you agree, I can update patch accordingly.
>> BTW, do you have any better trick to take care of
>> above scenraio ?
> 
> Well, we could just add the warn_once prints but that doesn't stop debug
> from breaking after the first pm/suspend/hotplug operation.
> 
Probably we should drop the $subject patch approach and use warn_once
at least for current rc. Ofcourse it doesn't help to get working
hw_breakpoint support. Patch end of the email with warn_once.

Regards,
Santosh

>From 6611d48eb5571e3e094c7a9c2479e652b37d35e3 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar 
Date: Tue, 19 Mar 2013 11:53:41 +0530
Subject: [PATCH] ARM: hw_breakpoints: Use warn_once to avoid debug message
 flood from reset_ctrl_regs()

CPU debug features like hardware break, watchpoints can be used only when
the debug mode is enabled and available. Unfortunately on OMAP4 based device,
after a CPU power cycle, the debug feature gets disabled which leads to
a flood of messages coming from reset_ctrl_regs() which gets called on
every CPU_PM_EXIT with CPUidle enabled.

So make use of warn_once() so that system is usable.

Thanks to Will for pointers and Lokesh for the analysis of the issue.

Cc: Will Deacon 

Tested-by: Lokesh Vutla 
Signed-off-by: Santosh Shilimkar 
---
 arch/arm/kernel/hw_breakpoint.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 96093b7..5dc1aa6 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -966,7 +966,7 @@ static void reset_ctrl_regs(void *unused)
}
 
if (err) {
-   pr_warning("CPU %d debug is powered down!\n", cpu);
+   pr_warn_once("CPU %d debug is pow

Re: [PATCH 0/2] omap serial fix.

2013-03-18 Thread Santosh Shilimkar

On Monday 18 March 2013 04:24 PM, Sourav Poddar wrote:
> The first patch is a rearrangement of a macro "OMAP_MAX_HSUART_PORTS"
> to a header file so that it can be used in other file. 
> The second patch fixes the wakeup from uart issue while using 
> "no_console_suspend" in the
> bootargs.
> 
> These patches are developed on 3.8 custom kernel containing omap5 
> supend/resume support.
> 
> Cc: Santosh Shilimkar 
Thanks Sourav for sorting out the issue. With update of changelog
suggested by Kevin on patch 1, Feel free to add,
Acked-Tested-by: Santosh Shilimkar 


> Sourav Poddar (2):
>   driver: serial-omap: move max uart count into generic header file.
>   arm: mach-omap2: prevent UART console idle on suspend while using
> "no_console_suspend"
> 
>  arch/arm/mach-omap2/omap_device.c |   36 
> -
>  drivers/tty/serial/omap-serial.c  |2 -
>  include/linux/platform_data/serial-omap.h |1 +
>  3 files changed, 36 insertions(+), 3 deletions(-)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 03/10] gpio: gpio-omap.c: fix checkpatch error

2013-03-20 Thread Santosh Shilimkar

On Wednesday 20 March 2013 05:45 PM, Laurent Navet wrote:
> Fix :
>  gpio/gpio-omap.c:697: ERROR: space required before the open parenthesis '('
> 
> Signed-off-by: Laurent Navet 
> ---
Acked-by: Santosh Shilimkar 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 03/10] gpio: gpio-omap.c: fix checkpatch error

2013-03-20 Thread Santosh Shilimkar

On Wednesday 20 March 2013 05:54 PM, Santosh Shilimkar wrote:
> On Wednesday 20 March 2013 05:45 PM, Laurent Navet wrote:
>> Fix :
>>  gpio/gpio-omap.c:697: ERROR: space required before the open parenthesis '('
>>
>> Signed-off-by: Laurent Navet 
>> ---
> Acked-by: Santosh Shilimkar 
> 
Minor suggestion.
You might want to use $subject like "coding style fixes".
No strong opinion though.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCHv2 0/2] add omap mcspi device tree data

2013-02-04 Thread Santosh Shilimkar


On Monday 04 February 2013 02:21 PM, Sourav Poddar wrote:

Patch series adds omap5 evm mcspi nodes and pinctrl
data in omap5.dtsi and omap5-evm.dts files.

Felipe Balbi (1):
   arm: dts: omap5: add SPI devices to OMAP5 DeviceTree file

Sourav Poddar (1):
   arm: dts: omap5-evm: Add mcspi data

  arch/arm/boot/dts/omap5-evm.dts |   46 +++
  arch/arm/boot/dts/omap5.dtsi|   40 +
  2 files changed, 86 insertions(+), 0 deletions(-)


Acked-by: Santosh Shilimkar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v5 1/3] ARM: vmregion: remove vmregion code entirely

2013-02-05 Thread Santosh Shilimkar


On Tuesday 05 February 2013 06:01 AM, Joonsoo Kim wrote:

Now, there is no user for vmregion.
So remove it.

Acked-by: Nicolas Pitre 
Signed-off-by: Joonsoo Kim 

diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile
index 8a9c4cb..4e333fa 100644
--- a/arch/arm/mm/Makefile
+++ b/arch/arm/mm/Makefile
@@ -6,7 +6,7 @@ obj-y   := dma-mapping.o extable.o 
fault.o init.o \
   iomap.o

  obj-$(CONFIG_MMU) += fault-armv.o flush.o idmap.o ioremap.o \
-  mmap.o pgd.o mmu.o vmregion.o
+  mmap.o pgd.o mmu.o

  ifneq ($(CONFIG_MMU),y)
  obj-y += nommu.o
diff --git a/arch/arm/mm/vmregion.c b/arch/arm/mm/vmregion.c
deleted file mode 100644
index a631016..000
--- a/arch/arm/mm/vmregion.c
+++ /dev/null
@@ -1,205 +0,0 @@

You might want to use 'git format-patch -D'
which will just generate one line for a deleted file.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v5 0/3] introduce static_vm for ARM-specific static mapped area

2013-02-05 Thread Santosh Shilimkar


On Tuesday 05 February 2013 06:01 AM, Joonsoo Kim wrote:

In current implementation, we used ARM-specific flag, that is,
VM_ARM_STATIC_MAPPING, for distinguishing ARM specific static mapped area.
The purpose of static mapped area is to re-use static mapped area when
entire physical address range of the ioremap request can be covered
by this area.

This implementation causes needless overhead for some cases.
For example, assume that there is only one static mapped area and
vmlist has 300 areas. Every time we call ioremap, we check 300 areas for
deciding whether it is matched or not. Moreover, even if there is
no static mapped area and vmlist has 300 areas, every time we call
ioremap, we check 300 areas in now.

If we construct a extra list for static mapped area, we can eliminate
above mentioned overhead.
With a extra list, if there is one static mapped area,
we just check only one area and proceed next operation quickly.

In fact, it is not a critical problem, because ioremap is not frequently
used. But reducing overhead is better idea.

Another reason for doing this work is for removing vm_struct list management,
entirely. For more information, look at the following link.
http://lkml.org/lkml/2012/12/6/184



[..]



Joonsoo Kim (3):
   ARM: vmregion: remove vmregion code entirely
   ARM: ioremap: introduce an infrastructure for static mapped area
   ARM: mm: use static_vm for managing static mapped areas

  arch/arm/mm/Makefile   |2 +-
  arch/arm/mm/ioremap.c  |  135 +--
  arch/arm/mm/mm.h   |   12 +++
  arch/arm/mm/mmu.c  |   34 
  arch/arm/mm/vmregion.c |  205 
  arch/arm/mm/vmregion.h |   31 
  6 files changed, 123 insertions(+), 296 deletions(-)
  delete mode 100644 arch/arm/mm/vmregion.c
  delete mode 100644 arch/arm/mm/vmregion.h


Nice Clean-up. I tested this series on OMAP which uses few static
mappings. Feel free to add,

Tested-by: Santosh Shilimkar
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 4/6] ARM: dts: omap5: add dwc3 omap dt data

2013-02-05 Thread Santosh Shilimkar


On Tuesday 05 February 2013 02:16 PM, kishon wrote:

Hi,

On Tuesday 05 February 2013 02:08 PM, Felipe Balbi wrote:

On Tue, Feb 05, 2013 at 01:59:26PM +0530, kishon wrote:

On Sunday 27 January 2013 01:17 AM, Sergei Shtylyov wrote:

Hello.

On 25-01-2013 15:11, Kishon Vijay Abraham I wrote:


Add dwc3 omap glue data to the omap5 dt data file. The information
about
the dt node added here is available @
Documentation/devicetree/bindings/usb/omap-usb.txt



Signed-off-by: Kishon Vijay Abraham I 
---
  arch/arm/boot/dts/omap5.dtsi |   11 +++
  1 file changed, 11 insertions(+)



diff --git a/arch/arm/boot/dts/omap5.dtsi
b/arch/arm/boot/dts/omap5.dtsi
index 5f59bf2..1703a72 100644
--- a/arch/arm/boot/dts/omap5.dtsi
+++ b/arch/arm/boot/dts/omap5.dtsi
@@ -513,6 +513,17 @@
  ti,type = <2>;
  };

+omap_dwc3@4a02 {
+compatible = "ti,dwc3";
+ti,hwmods = "usb_otg_ss";
+reg = <0x4a02 0x1ff>;


Shoudn't the "reg" length be 0x200 here? It's length, not limit.


I think 0x1ff is correct. I got the data from hwmod data.


hwmod is utterly wrong. Looking at TRM, it says the size here is 64KiB
(0x1), so is the size for dwc3 itself. Please don't blindly trust
hwmod, make sure you read data from TRM ;-)


hmm..ok. But it has only 17 registers :-D


As Felipe said, it should be 0x200. And if you are
interested in lesser space, there is no need to map entire 64 KB
address space if it isn't being used.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCHv3 4/4] arm: Add generic timer broadcast support

2013-02-07 Thread Santosh Shilimkar


Mark,

On Thursday 07 February 2013 03:34 PM, Mark Rutland wrote:

Hi Stephen,

Sorry about this; I'm to blame for the bug.

On Wed, Feb 06, 2013 at 08:51:43PM +, Stephen Warren wrote:

On 01/14/2013 10:05 AM, Mark Rutland wrote:

Implement timer_broadcast for the arm architecture, allowing for the use
of clock_event_device_drivers decoupled from the timer tick broadcast
mechanism.


Mark, this patch is now in next-20130206 and causes a crash during boot
on Tegra. The reason appears to be because of:


diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c



@@ -524,7 +524,6 @@ static void __cpuinit percpu_timer_setup(void)
struct clock_event_device *evt = &per_cpu(percpu_clockevent, cpu);

evt->cpumask = cpumask_of(cpu);
-   evt->broadcast = smp_timer_broadcast;


After that change, evt->broadcast is never assigned, and hence is NULL.
Yet elsewhere in kernel/time/tick-broadcast.c it's used unconditionally:

static void tick_do_broadcast(struct cpumask *mask)
...
if (!cpumask_empty(mask)) {
...
td = &per_cpu(tick_cpu_device, cpumask_first(mask));
td->evtdev->broadcast(mask);

Now perhaps the Tegra timer driver simply isn't being set up correctly,
so the bug is there... But the only other place I can find where
->broadcast is assigned is in tick_device_uses_broadcast() which only
does it for "non-functional" timers, which doesn't apply to Tegra's timer.



The intent of 12ad100046: "clockevents: Add generic timer broadcast function"
was to setup the broadcast function both for non-functional/dummy timers and
those that stop in low-power states (CLOCK_EVT_FEAT_C3STOP). I missed the
CLOCK_EVT_FEAT_C3STOP case.

I am not sure this exactly the case. Because in my testing, the C3STOP 
path was exercised already. And if the C3STOP is used then notifiers

calls are expected for switching of clock-events to broadcast mode.

And dummy broad-cast hook should come into picture only if the per CPU
local timer clock-event are not registered.

I just tried "next-20130207" on OMAP and I see the broad-cast mechanism
works and didn't observe any crash.

--
Tick Device: mode: 1
Broadcast device
Clock Event Device: gp_timer
 max_delta_ns:   131071523464981
 min_delta_ns:   91552
 mult:   70369
 shift:  31
 mode:   3
 next_event: 89984375000 nsecs
 set_next_event: omap2_gp_timer_set_next_event
 set_mode:   omap2_gp_timer_set_mode
 event_handler:  tick_handle_oneshot_broadcast
 retries:0

tick_broadcast_mask: 0003
tick_broadcast_oneshot_mask: 


Tick Device: mode: 1
Per CPU device: 0
Clock Event Device: local_timer
 max_delta_ns:   10737418240
 min_delta_ns:   1000
 mult:   858993459
 shift:  31
 mode:   3
 next_event: 12525000 nsecs
 set_next_event: twd_set_next_event
 set_mode:   twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:346

Tick Device: mode: 1
Per CPU device: 1
Clock Event Device: local_timer
 max_delta_ns:   10737418240
 min_delta_ns:   1000
 mult:   858993459
 shift:  31
 mode:   3
 next_event: 89921875000 nsecs
 set_next_event: twd_set_next_event
 set_mode:   twd_set_mode
 event_handler:  hrtimer_interrupt
 retries:258

#

--


I believe the patch below will fix this for Tegra and any other platforms where
broadcast is required in low power states.


Am not sure you really need that patch unless and until am missing a
scenario in my test.

Stephan,

Do we have crash log seen on Tegra ? Which tegra CPUIDLE driver
is being used when crash is seen ?

Regards
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCHv3 4/4] arm: Add generic timer broadcast support

2013-02-07 Thread Santosh Shilimkar


On Thursday 07 February 2013 10:21 PM, Stephen Warren wrote:

On 02/07/2013 04:40 AM, Santosh Shilimkar wrote:

[...]


Stephan,

Do we have crash log seen on Tegra ? Which tegra CPUIDLE driver
is being used when crash is seen ?


The crash log is below. It's simply because td->evt->broadcast is NULL
because it wasn't ever set.


Thanks for the log. Its clear from the log that broadcast event wasn't
set.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCHv3 4/4] arm: Add generic timer broadcast support

2013-02-07 Thread Santosh Shilimkar


Mark,

On Thursday 07 February 2013 10:47 PM, Mark Rutland wrote:

[..]


I believe the patch below will fix this for Tegra and any other platforms where
broadcast is required in low power states.


From the Stephan's crash the issue is pretty clear and your patch make
sense. Some how I didn't manage to reproduce the issue on my board and
hence assumed everything is just fine. Will have a look at it later why
it didn't explode since it was so obvious.

Sorry for the noise.

Regards
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC v2] dmaengine: omap-dma: Start DMA without delay for cyclic channels

2013-04-03 Thread Santosh Shilimkar

On Wednesday 03 April 2013 04:47 PM, Peter Ujfalusi wrote:
> cyclic DMA is only used by audio which needs DMA to be started without a
> delay.
> If the DMA for audio is started using the tasklet we experience random
> channel switch (to be more precise: channel shift).
> 
> Reported-by: Peter Meerwald 
> Signed-off-by: Peter Ujfalusi 
> ---
> Hi Russell,
> 
> Instead of removing the tasklet we can identify the DMA channel used by audio
> based on the cyclic flag of the channel.
> I think this can be used as a short term solution to fix the audio channel 
> shift
> issue and later when we have the dynamic DMA channel allocation we can adjust
> the code.
> 
> Regards,
> Peter
> 
>  drivers/dma/omap-dma.c | 20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c
> index 2ea3d7e..ec3fc4f 100644
> --- a/drivers/dma/omap-dma.c
> +++ b/drivers/dma/omap-dma.c
> @@ -282,12 +282,20 @@ static void omap_dma_issue_pending(struct dma_chan 
> *chan)
>  
>   spin_lock_irqsave(&c->vc.lock, flags);
>   if (vchan_issue_pending(&c->vc) && !c->desc) {
If you add "!c->cyclic" in above if then you can avoid
indentation change and just have else for cyclic case.

> - struct omap_dmadev *d = to_omap_dma_dev(chan->device);
> - spin_lock(&d->lock);
> - if (list_empty(&c->node))
> - list_add_tail(&c->node, &d->pending);
> - spin_unlock(&d->lock);
> - tasklet_schedule(&d->task);
> + /*
> +  * c->cyclic is used only by audio and in this case the DMA need
> +  * to be started without delay.
> +  */
> + if (!c->cyclic) {
> + struct omap_dmadev *d = to_omap_dma_dev(chan->device);
> + spin_lock(&d->lock);
> + if (list_empty(&c->node))
> + list_add_tail(&c->node, &d->pending);
> + spin_unlock(&d->lock);
> + tasklet_schedule(&d->task);
> + } else {
> + omap_dma_start_desc(c);
> + }
>   }
>   spin_unlock_irqrestore(&c->vc.lock, flags);
>  }
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC v2] dmaengine: omap-dma: Start DMA without delay for cyclic channels

2013-04-03 Thread Santosh Shilimkar

On Wednesday 03 April 2013 05:22 PM, Peter Ujfalusi wrote:
> On 04/03/2013 01:24 PM, Santosh Shilimkar wrote:
>> On Wednesday 03 April 2013 04:47 PM, Peter Ujfalusi wrote:
>>> cyclic DMA is only used by audio which needs DMA to be started without a
>>> delay.
>>> If the DMA for audio is started using the tasklet we experience random
>>> channel switch (to be more precise: channel shift).
>>>
>>> Reported-by: Peter Meerwald 
>>> Signed-off-by: Peter Ujfalusi 
>>> ---
>>> Hi Russell,
>>>
>>> Instead of removing the tasklet we can identify the DMA channel used by 
>>> audio
>>> based on the cyclic flag of the channel.
>>> I think this can be used as a short term solution to fix the audio channel 
>>> shift
>>> issue and later when we have the dynamic DMA channel allocation we can 
>>> adjust
>>> the code.
>>>
>>> Regards,
>>> Peter
>>>
>>>  drivers/dma/omap-dma.c | 20 ++--
>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c
>>> index 2ea3d7e..ec3fc4f 100644
>>> --- a/drivers/dma/omap-dma.c
>>> +++ b/drivers/dma/omap-dma.c
>>> @@ -282,12 +282,20 @@ static void omap_dma_issue_pending(struct dma_chan 
>>> *chan)
>>>  
>>> spin_lock_irqsave(&c->vc.lock, flags);
>>> if (vchan_issue_pending(&c->vc) && !c->desc) {
>> If you add "!c->cyclic" in above if then you can avoid
>> indentation change and just have else for cyclic case.
> 
> It can not be embedded there because of the existing tests. How would we
> handle the case when c->desc is _not_ NULL and c->cyclic is false? We would
> need to test again in else, but we can not do this for the
> vchan_issue_pending(&c->vc).
> 
right. Thanks for clarifying it.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH/Resend 2/2] arm: mach-omap2: prevent UART console idle on suspend while using "no_console_suspend"

2013-04-05 Thread Santosh Shilimkar

On Friday 05 April 2013 12:38 PM, Sourav Poddar wrote:
> Hi Kevin,
> On Wednesday 03 April 2013 11:18 PM, Kevin Hilman wrote:
>> Sourav Poddar  writes:
>>
>>> Hi Kevin,
>>> On Wednesday 20 March 2013 05:36 PM, Sourav Poddar wrote:
>>>> Realised the list  to whom the patch was send got dropped. Ccing
>>>> them all..
>>>> On Wednesday 20 March 2013 05:18 PM, Sourav Poddar wrote:
>>>>> Hi Kevin,
>>>>> On Tuesday 19 March 2013 12:24 AM, Kevin Hilman wrote:
>>>>>> Sourav Poddar   writes:
>>>>>>
>>>>>>> With dt boot, uart wakeup after suspend is non functional on
>>>>>>> omap4/5 while using
>>>>>>> "no_console_suspend" in the bootargs. With "no_console_suspend"
>>>>>>> used, od->flags
>>>>>>> should be ORed with "OMAP_DEVICE_NO_IDLE_ON_SUSPEND", thereby not
>>>>>>> allowing the console
>>>>>>> to idle in the suspend path. For non-dt case, this was taken care
>>>>>>> by platform data.
>>>>>>>
>>>>>>> Tested on omap5430evm, omap4430sdp.
>>>>>>>
>>>>>>> Cc: Santosh Shilimkar
>>>>>>> Cc: Felipe Balbi
>>>>>>> Cc: Rajendra nayak
>>>>>>> Signed-off-by: Sourav Poddar
>>>>>> This patch creates a dependency between omap_device (generic,
>>>>>> device-independent code) and a specific driver (UART.)
>>>>>>
>>>>>> If you need to do something like this that's DT boot specific, then
>>>>>> we probably need some late initcall in serial.c to handle this.
>>>>>> It does
>>>>>> not belong in omap_device.
>>>>>>
>>>>> The following function "omap_device_disable_idle_on_suspend(pdev)"
>>>>> should only
>>>>> be called once the omap device has been build, which in the case of
>>>>> device tree is
>>>>> done in omap_device.c file. Moreover, the above call should be
>>>>> executed conditionally
>>>>> and should depend on the following two parameter.
>>>>>
>>>>> [1]  a. Whether "no_console_suspend" is set and
>>>>>   b.  the device build is a console uart.
>>>>>
>>>>> When I look closely into the serial.c file, I realised that
>>>>> "core_initcall(omap_serial_early_init)" gets called irrespective
>>>>> of dt/non dt boot and will take care of most of the stuff(checking
>>>>> whether
>>>>> "no_console_suspend" is used and which uart is used as a console
>>>>> uart) which the
>>>>> $subject patch is proposing.
>>>>>
>>>>> But the problem is that we need to exchange the parsed information
>>>>> from serial.c to the omap_device file for the condtional execution of
>>>>> "omap_device_disable_idle_on_suspend"
>>>>>
>>>>> In this case,
>>>>> from "serial.c" we need
>>>>>  1. no_console_suspend = true
>>>>>  2. strcpy(console_name, oh_name), where oh_name corresponds to
>>>>> the console uart.
>>>>>
>>>>> then in "omap_device.c" do
>>>>>  if (no_console_suspend&&  !strcmp(oh->name, console_name))
>>>>>  omap_device_disable_idle_on_suspend(pdev);
>>>>>
>>>>> Please correct if I am understanding it incorrectly.
>>>>>
>>>>> If the above understanding looks good to you, is there a way we can
>>>>> make this
>>>>> exchange of information happen between serial.c and omap_device.c file?
>>> Any input on this?
>>> As I explained earlier, that there is a need to parse information in
>>> serial.c and use that in
>>> omap_device.c only after the device is build.
>> As I explained earlier, any device specific hacks inside omap_device
>> should be a red flag that something has gone wrong.
>>
>> How about fixing the UART driver/core to not runtime suspend if
>> no_console_suspend is given?
>>
>> Then we can get rid of this no_idle_on_suspend hack all together since
>> UART is the only remaining user.
>>
> 
> Yes, that can be done.
> 
> I cooked up an experimental p

Re: One of these things (CONFIG_HZ) is not like the others..

2013-01-28 Thread Santosh Shilimkar


Jon,

On Tuesday 29 January 2013 05:31 AM, John Stultz wrote:

On 01/27/2013 10:08 PM, Santosh Shilimkar wrote:

On Tuesday 22 January 2013 08:35 PM, Santosh Shilimkar wrote:

On Tuesday 22 January 2013 08:21 PM, Russell King - ARM Linux wrote:

On Tue, Jan 22, 2013 at 03:44:03PM +0530, Santosh Shilimkar wrote:


[..]


Thanks for expanding it. It is really helpful.


And I think further discussion is pointless until such research has
been
done (or someone who _really_ knows the time keeping/timer/sched code
inside out comments.)


Fully agree about experimentation to re-asses the drift.
 From what I recollect from past, few OMAP customers did
report the time drift issue and that is how the switch
from 100 --> 128 happened.

Anyway I have added the suggested task to my long todo list.


So I tried to see if any time drift with HZ = 100 on OMAP. I ran the
setup for 62 hours and 27 mins with time synced up once with NTP server.
I measure about ~174 millisecond drift which is almost noise considering
the observed duration was ~22482 milliseconds.


So 174ms drift doesn't sound great, as < 2ms (often much less - though
that depends on how close the server is) can be expected with NTP.
Although its not clear how you were measuring: Did you see a max 174ms
offset while trying to sync with NTP? Was that offset shortly after
starting NTP or after NTP converged down?

To avoid the server latency, we didn't do continuous sync. The time was 
synced in the beginning and after 62.5 hours (#ntpd -qg) and the drift

of about 174 ms was observed. As you said this could be because of
server sync time along with probably some addition from system calls
from #ntpd. As mentioned, the other run with HZ = 128 which started
15 hours 20 mins is already showing about 24 mS drift now. I will
let it run for couple of more days just to have similar duration run.

Regards,
santosh



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/2] ARM: kernel: update cpuinfo to print CPU model name

2013-01-29 Thread Santosh Shilimkar


On Wednesday 30 January 2013 04:42 AM, Ruslan Bilovol wrote:

Hi,

On Tue, Jan 29, 2013 at 6:08 PM, Russell King - ARM Linux
 wrote:

On Tue, Jan 29, 2013 at 05:54:24PM +0200, Ruslan Bilovol wrote:

CPU implementer : 0x41
CPU name: OMAP4470 ES1.0 HS


Sigh.  No.  Look at what you're doing - look carefully at the above.

"CPU implementer" - 0x41.  That's A.  For ARM Ltd.  ARM Ltd implemented
this CPU.  Did ARM Ltd really implement OMAP4470 ?  I think TI would be
very upset if that were to be the case.


Yes, it would be very surprisingly :)



So no, OMAP4470 is _NOT_ a CPU.  It is a SoC.  The CPU inside the SoC is
a collection of ARM Ltd Cortex A9 _CPUs_.

See?  Please, learn what a CPU is as opposed to a SoC.


Completely agree with you. I will fix this


Thank god you agreed to drop your current approach. Please elaborate
what you are going to fix and also state what user-space features
changes from OMAP4460 to OMAP4470.

Regards,
Santosh



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH v2 0/2] ARM: update cpuinfo to print SoC model name

2013-01-29 Thread Santosh Shilimkar


On Wednesday 30 January 2013 06:08 AM, Ruslan Bilovol wrote:

The following patches update cpuinfo to print SoC
model name for ARM.
The first patch exactly makes needed changes for ARM
architecture and adds a common approach to show SoC name.
Second patch uses this approach for OMAP4 SoCs (as live
example).

Looks like there were few attempts to do similar
changes to cpuinfo without any luck (had stuck
on review) so this functionality is still not
in the kernel yet.
In this patch series the update to cpuinfo is very
short (10 lines) and easy.

Comments are welcome as usual


As most of the people already commented, your purpose
behind the series isn't clear so please give examples
where and why you need  device type information from
user-space.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] arm: dts: omap4-sdp: Add I2c pinctrl data

2013-01-30 Thread Santosh Shilimkar


On Wednesday 30 January 2013 02:13 PM, Kumar, Anil wrote:

Hi Sourav,

On Wed, Jan 30, 2013 at 12:10:18, Poddar, Sourav wrote:

Hi Luciano,
On Wednesday 30 January 2013 11:55 AM, Luciano Coelho wrote:

Hi Sourav,

On Mon, 2013-01-28 at 16:47 +0530, Sourav Poddar wrote:

Booting 3.8-rc4 om omap 4430sdp results in the following error

omap_i2c 4807.i2c: did not get pins for i2c error: -19
[1.024261] omap_i2c 4807.i2c: bus 0 rev0.12 at 100 kHz
[1.030181] omap_i2c 48072000.i2c: did not get pins for i2c error: -19
[1.037384] omap_i2c 48072000.i2c: bus 1 rev0.12 at 400 kHz
[1.043762] omap_i2c 4806.i2c: did not get pins for i2c error: -19
[1.050964] omap_i2c 4806.i2c: bus 2 rev0.12 at 100 kHz
[1.056823] omap_i2c 4807a000.i2c: did not get pins for i2c error: -19
[1.064025] omap_i2c 4807a000.i2c: bus 3 rev0.12 at 400 kHz

This happens because omap4 dts file is not adapted to use i2c through pinctrl
framework. Populating i2c pinctrl data to get rid of the error.

Tested on omap4430 sdp with 3.8-rc4 kernel.

Signed-off-by: Sourav Poddar 
Reported-by: Santosh Shilimkar 
---

Could you do the same thing for panda? I'm getting the same kind of
errors with it:


omap4 uses pinctrl-single driver for pinmux with DT. Currently
pinctrl-single driver is getting up after I2C driver. So I2c cannot
use pinctrl. The below patch solve this issue

http://www.gossamer-threads.com/lists/linux/kernel/1669067

Can you try with this ? it may solve it.


OMAP i2c driver already takes care of -EPROBE_DEFER. The issue
as you see from the log is not probe failure but missing the
pin information in DT blob. And thats what patch does.

Regards
santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type

2013-01-31 Thread Santosh Shilimkar


On Friday 01 February 2013 12:10 PM, Viresh Kumar wrote:

policy->shared_type field was added only for SoCs with ACPI support:

commit 3b2d99429e3386b6e2ac949fc72486509c8bbe36
Author: Venkatesh Pallipadi 
Date:   Wed Dec 14 15:05:00 2005 -0500

 P-state software coordination for ACPI core

 http://bugzilla.kernel.org/show_bug.cgi?id=5737

Many non-ACPI systems are filling this field by mistake, which makes its usage
confusing. Lets clean it.

Signed-off-by: Viresh Kumar 
Cc: Linus Walleij 
Cc: Stephen Warren 
Cc: Shawn Guo 
Cc: Santosh Shilimkar 
---

I haven't looked at the cpufreq code recently but remember
that it was needed to ensure that all the CPU which
share clock/voltage gets updated (affected cpus) on
freq change. The CPUs which needs SW co-ordination, should
have this flag enabled and OMAP was falling in that category.

May be I miss-understood its use, but can you confirm that
SW co-ordination logic continues to work without this flag ?

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type

2013-01-31 Thread Santosh Shilimkar


On Friday 01 February 2013 12:43 PM, Viresh Kumar wrote:

On 1 February 2013 12:17, Santosh Shilimkar  wrote:

I haven't looked at the cpufreq code recently but remember
that it was needed to ensure that all the CPU which
share clock/voltage gets updated (affected cpus) on
freq change. The CPUs which needs SW co-ordination, should
have this flag enabled and OMAP was falling in that category.


Freq change are done by the target routines of platform cpufreq drivers
and they do something like:

for_each_cpu(freqs.cpu, policy->cpus)
cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE);

The only requirement from cpufreq core is to keep cpus sharing clock
in policy->cpus.


I am not talking about just notifiers. This is for external users who
has subscribed for notifiers. The point is whether the core CPUFReq
gets updated without that flag for all affected CPU.


May be I miss-understood its use, but can you confirm that
SW co-ordination logic continues to work without this flag ?


I believe it should work. It works for the systems i worked on:

SPEAr13xx: Dual Cortex A9
ARM TC2: two clusters of A15s and A7s.


I will give a try some time next week on OMAP.

Regards
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type

2013-02-01 Thread Santosh Shilimkar


On Friday 01 February 2013 01:32 PM, Viresh Kumar wrote:

On 1 February 2013 13:03, Santosh Shilimkar  wrote:

I am not talking about just notifiers. This is for external users who
has subscribed for notifiers. The point is whether the core CPUFReq
gets updated without that flag for all affected CPU.


Yes, its safe. Follow this thread, yesterday i explained this to Tomasz Figa:

http://www.spinics.net/lists/arm-kernel/msg221629.html


That part was very clear to me Viresh. Anyway thanks for the link.
From what I read so far, it might just work but I would want to
try it out before acking the approach.

Regards,
Santosh



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type

2013-02-01 Thread Santosh Shilimkar


Viresh,

On Friday 01 February 2013 02:22 PM, Santosh Shilimkar wrote:

On Friday 01 February 2013 01:32 PM, Viresh Kumar wrote:

On 1 February 2013 13:03, Santosh Shilimkar 
wrote:

I am not talking about just notifiers. This is for external users who
has subscribed for notifiers. The point is whether the core CPUFReq
gets updated without that flag for all affected CPU.


Yes, its safe. Follow this thread, yesterday i explained this to
Tomasz Figa:

http://www.spinics.net/lists/arm-kernel/msg221629.html


That part was very clear to me Viresh. Anyway thanks for the link.
 From what I read so far, it might just work but I would want to
try it out before acking the approach.


You are correct. Sorry for oversight on your initial point about the
usage of the flag. When I added that flag, I just went by the
description thinking the cpufreq core booking and stat updates
use the flag. Its not the case.

Thanks for the fix. For the patch,
Acked-by: Santosh Shilimkar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/3] mmc: omap_hsmmc: convert to dma_request_slave_channel_compat()

2013-02-01 Thread Santosh Shilimkar


On Saturday 02 February 2013 02:31 AM, Matt Porter wrote:

Convert dmaengine channel requests to use
dma_request_slave_channel_compat(). This supports platforms booting
with or without DT populated.

Signed-off-by: Matt Porter 
Acked-by: Tony Lindgren 
---

Acked-by: Santosh Shilimkar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3/3] mmc: omap_hsmmc: add generic DMA request support to the DT binding

2013-02-01 Thread Santosh Shilimkar


On Saturday 02 February 2013 02:31 AM, Matt Porter wrote:

The binding definition is based on the generic DMA request binding.

Signed-off-by: Matt Porter 
Acked-by: Tony Lindgren 
---


Acked-by: Santosh Shilimkar

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v5 00/14] DMA Engine support for AM33XX

2013-01-24 Thread Santosh Shilimkar


On Friday 25 January 2013 02:19 AM, Matt Porter wrote:

On Thu, Jan 24, 2013 at 05:12:05AM +, Shilimkar, Santosh wrote:

On Thursday 24 January 2013 02:19 AM, Matt Porter wrote:

On Wed, Jan 23, 2013 at 04:37:55PM +0530, Santosh Shilimkar wrote:

Matt,

On Wednesday 16 January 2013 02:02 AM, Matt Porter wrote:



[...]



Thanks for the catch, I'll add this in with your tested line as well for
the series.


BTW, since I'm dropping mmc support from this series, I'll move your
patch into a separate series with only the mmc pieces.


Fine by me
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: One of these things (CONFIG_HZ) is not like the others..

2013-01-27 Thread Santosh Shilimkar


On Tuesday 22 January 2013 08:35 PM, Santosh Shilimkar wrote:

On Tuesday 22 January 2013 08:21 PM, Russell King - ARM Linux wrote:

On Tue, Jan 22, 2013 at 03:44:03PM +0530, Santosh Shilimkar wrote:

Sorry for not being clear enough. On OMAP, 32KHz is the only clock which
is always running(even during low power states) and hence the clock
source and clock event have been clocked using 32KHz clock. As mentioned
by RMK, with 32768 Hz clock and HZ = 100, there will be always an
error of 0.1 %. This accuracy also impacts the timer tick interval.
This was the reason, OMAP has been using the HZ = 128.


Ok.  Let's look at this.  As far as time-of-day is concerned, this
shouldn't really matter with the clocksource/clockevent based system
that we now have (where *important point* platforms have been converted
over.)

Any platform providing a clocksource will override the jiffy-based
clocksource.  The measurement of time-of-day passing is now based on
the difference in values read from the clocksource, not from the actual
tick rate.

Anything _not_ providing a clock source will be reliant on jiffies
incrementing, which in turn _requires_ one timer interrupt per jiffies
at a known rate (which is HZ).

Now, that's the time of day, what about jiffies?  Well, jiffies is
incremented based on a certain number of nsec having passed since the
last jiffy update.  That means the code copes with dropped ticks and
the like.

However, if your actual interrupt rate is close to the desired HZ, then
it can lead to some interesting effects (and noise):

- if the interrupt rate is slightly faster than HZ, then you can end up
   with updates being delayed by 2x interrupt rate.
- if the interrupt rate is slightly slower than HZ, you can occasionally
   end up with jiffies incrementing by two.
- if your interrupt rate is dead on HZ, then other system noise can come
   into effect and you may get maybe zero, one or two jiffy increments
per
   interrupt.

(You have to think about time passing in NS, where jiffy updates should
be vs where the timer interrupts happen.)  See tick_do_update_jiffies64()
for the details.

The timer infrastructure is jiffy based - which includes scheduling where
the scheduler does not use hrtimers.  That means a slight discrepency
between HZ and the actual interrupt rate can cause around 1/HZ jitter.
That's a matter of fact due to how the code works.

So, actually, I think the accuracy of HZ has much overall effect
_provided_
a platform provides a clocksource to the accuracy of jiffy based timers
nor timekeeping.  For those which don't, the accuracy of the timer
interrupt to HZ is very important.

(This is just based on reading some code and not on practical
experiments - I'd suggest some research of this is done, trying HZ=100
on OMAP's 32kHz timers, checking whether there's any drift, checking
how accurately a single task can be woken from various select/poll/epoll
delays, and checking whether NTP works.)


Thanks for expanding it. It is really helpful.


And I think further discussion is pointless until such research has been
done (or someone who _really_ knows the time keeping/timer/sched code
inside out comments.)


Fully agree about experimentation to re-asses the drift.
 From what I recollect from past, few OMAP customers did
report the time drift issue and that is how the switch
from 100 --> 128 happened.

Anyway I have added the suggested task to my long todo list.


So I tried to see if any time drift with HZ = 100 on OMAP. I ran the
setup for 62 hours and 27 mins with time synced up once with NTP server.
I measure about ~174 millisecond drift which is almost noise considering
the observed duration was ~22482 milliseconds.

Am re-running the setup with HZ = 128 for similar time frame to see if
the minimal drift observed goes away.

Once through that, I will send a patch to update the OMAP to use
HZ = 100 and possibly get rid of the custom OMAP HZ config.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-24 Thread Santosh Shilimkar


On Saturday 23 February 2013 12:22 AM, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote:

Now we could make use of that and avoid going deep idle just to come
back right away via the IPI. Unfortunately the notification thingy has
no return value, but we can fix that.

To confirm that theory, could you please try the hack below and add
some instrumentation (trace_printk)?


Applied, and it looks like that's exactly why the warning triggers, at least
on the platform I am testing on which is a dual-cluster ARM testchip.


I too confirm that the warnings cause is same.


There is a still time window though where the CPU (the IPI target) can get
back to idle (tick_broadcast_pending still not set) before the CPU target of
the broadcast has a chance to run tick_handle_oneshot_broadcast (and set
tick_broadcast_pending), or am I missing something ?


Well, the tick_broadcast_pending bit is uninteresting if the
force_broadcast bit is set. Because if that bit is set we know for
sure, that we got woken with the cpu which gets the broadcast timer
and raced back to idle before the broadcast handler managed to
send the IPI.


Gah, my bad sorry, I mixed things up. I thought

tick_check_broadcast_pending()

was checking against the tick_broadcast_pending mask not

tick_force_broadcast_mask


Yep, that's a misnomer. I just wanted to make sure that my theory is
correct. I need to think about the real solution some more.

We have two alternatives:

1) Make the clockevents_notify function have a return value.

2) Add something like the hack I gave you with a proper name.

The latter has the beauty, that we just need to modify the platform
independent idle code instead of going down to every callsite of the
clockevents_notify thing.


I agree that 2 is better alternative to avoid multiple changes.
Whichever alternative you choose, will be happy to test it :)

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 4/4] ARM: timer-sp: Set dynamic irq affinity

2013-02-26 Thread Santosh Shilimkar


On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote:

From: Viresh Kumar 

When a cpu goes to a deep idle state where its local timer is shutdown, it
notifies the time frame work to use the broadcast timer instead.

Unfortunately, the broadcast device could wake up any CPU, including an idle one
which is not concerned by the wake up at all.


Broad-cast device will only open the CPU for which the timer IRQ
affined to. And infact with subject series the affinity also is
updated for the CPU which owns the last timer expiry event.


This implies, in the worst case, an idle CPU will wake up to send an IPI to
another idle cpu.

This patch fixes this for ARM platforms using timer-sp, by setting
CLOCK_EVT_FEAT_DYNIRQ feature.

Signed-off-by: Viresh Kumar 
Signed-off-by: Daniel Lezcano 
---

What am I missing here ?

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 4/4] ARM: timer-sp: Set dynamic irq affinity

2013-02-26 Thread Santosh Shilimkar


On Wednesday 27 February 2013 10:29 AM, Viresh Kumar wrote:

On 27 February 2013 10:26, Santosh Shilimkar  wrote:

On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote:


From: Viresh Kumar 

When a cpu goes to a deep idle state where its local timer is shutdown, it
notifies the time frame work to use the broadcast timer instead.

Unfortunately, the broadcast device could wake up any CPU, including an
idle one
which is not concerned by the wake up at all.


Broad-cast device will only open the CPU for which the timer IRQ
affined to. And infact with subject series the affinity also is
updated for the CPU which owns the last timer expiry event.

What am I missing here ?


Dynamic affinity will work only if the following flag is set for a
clock_event_device: CLOCK_EVT_FEAT_DYNIRQ, otherwise wakeup
would happen on the cpu to which static affinity was set to.


I should have looked at the patches in order first :)
Sorry for the noise.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/4] time : pass broadcast parameter

2013-02-26 Thread Santosh Shilimkar


On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote:

The broadcast timer could be passed as parameter to the function
instead of using again tick_broadcast_device.evtdev which was
previously used in the caller function.

Signed-off-by: Daniel Lezcano 
---

The change doesn't buy us as such even after looking
at next patch which tries to use bc. No strong opinion
though.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/4] time : set broadcast irq affinity

2013-02-26 Thread Santosh Shilimkar


On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote:

When a cpu goes to a deep idle state where its local timer is shutdown,
it notifies the time frame work to use the broadcast timer instead.

Unfortunately, the broadcast device could wake up any CPU, including an
idle one which is not concerned by the wake up at all.

This implies, in the worst case, an idle CPU will wake up to send an IPI
to another idle cpu.

This patch solves this by setting the irq affinity to the cpu concerned
by the nearest timer event, by this way, the CPU which is wake up is
guarantee to be the one concerned by the next event and we are safe with
unnecessary wakeup for another idle CPU.

As the irq affinity is not supported by all the archs, a flag is needed
to specify which clocksource can handle it.


Minor. Can mention the flag name as well here "CLOCK_EVT_FEAT_DYNIRQ"


Signed-off-by: Daniel Lezcano 
---
  include/linux/clockchips.h   |1 +
  kernel/time/tick-broadcast.c |   39 ---
  2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 6634652..c256cea 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -54,6 +54,7 @@ enum clock_event_nofitiers {
   */
  #define CLOCK_EVT_FEAT_C3STOP 0x08
  #define CLOCK_EVT_FEAT_DUMMY  0x10
+#define CLOCK_EVT_FEAT_DYNIRQ  0x20


Please add some comments about the usage of the flag.


  /**
   * struct clock_event_device - clock event device descriptor
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 6197ac0..1f7b4f4 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -406,13 +406,36 @@ struct cpumask *tick_get_broadcast_oneshot_mask(void)
return to_cpumask(tick_broadcast_oneshot_mask);
  }

-static int tick_broadcast_set_event(struct clock_event_device *bc,
+/*
+ * Set broadcast interrupt affinity
+ */
+static void tick_broadcast_set_affinity(struct clock_event_device *bc, int cpu)
+{

Better is just make second parameter as cpu_mask rather than CPU
cpu number. Its a semantic of affinity hook which you can easily retain.


+   if (!(bc->features & CLOCK_EVT_FEAT_DYNIRQ))
+   return;
+
+   if (cpumask_equal(bc->cpumask, cpumask_of(cpu)))
+   return;
+
+   bc->cpumask = cpumask_of(cpu);

You can avoid the cpumask_of() couple of times above.


+   irq_set_affinity(bc->irq, bc->cpumask);
+}
+
+static int tick_broadcast_set_event(struct clock_event_device *bc, int cpu,
ktime_t expires, int force)
  {
+   int ret;
+
if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);

-   return clockevents_program_event(bc, expires, force);
+   ret = clockevents_program_event(bc, expires, force);
+   if (ret)
+   return ret;
+
+   tick_broadcast_set_affinity(bc, cpu);

In case you go by cpumask paramater, then above can be just
tick_broadcast_set_affinity(bc, cpumask_of(cpu));


+
+   return 0;
  }

  int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
@@ -441,7 +464,7 @@ static void tick_handle_oneshot_broadcast(struct 
clock_event_device *dev)
  {
struct tick_device *td;
ktime_t now, next_event;
-   int cpu;
+   int cpu, next_cpu;

raw_spin_lock(&tick_broadcast_lock);
  again:
@@ -454,8 +477,10 @@ again:
td = &per_cpu(tick_cpu_device, cpu);
if (td->evtdev->next_event.tv64 <= now.tv64)
cpumask_set_cpu(cpu, to_cpumask(tmpmask));
-   else if (td->evtdev->next_event.tv64 < next_event.tv64)
+   else if (td->evtdev->next_event.tv64 < next_event.tv64) {
next_event.tv64 = td->evtdev->next_event.tv64;
+   next_cpu = cpu;
+   }
}

/*
@@ -478,7 +503,7 @@ again:
 * Rearm the broadcast device. If event expired,
 * repeat the above
 */
-   if (tick_broadcast_set_event(dev, next_event, 0))
+   if (tick_broadcast_set_event(dev, next_cpu, next_event, 0))
goto again;
}
raw_spin_unlock(&tick_broadcast_lock);
@@ -521,7 +546,7 @@ void tick_broadcast_oneshot_control(unsigned long reason)
cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
if (dev->next_event.tv64 < bc->next_event.tv64)
-   tick_broadcast_set_event(bc, dev->next_event, 
1);
+   tick_broadcast_set_event(bc, cpu, 
dev->next_event, 1);

Since you have embedded the irq_affinity() in above function,
the IRQ affinity for bc->irq will remain to the last CPU
on which the inte

Re: [PATCH 0/4] time: dynamic irq affinity

2013-02-26 Thread Santosh Shilimkar


On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote:

When a cpu goes to a deep idle state where its local timer is shutdown,
it notifies the time framework to use the broadcast timer instead.

Unfortunately, the broadcast device could wake up any CPU, including an
idle one which is not concerned by the wake up at all.

This implies, in the worst case, an idle CPU will wake up to send an IPI
to another idle cpu.

This patch solves this by setting the irq affinity to the cpu concerned
by the nearest timer event, by this way, the CPU which is wake up is
guarantee to be the one concerned by the next event and we are safe with
unnecessary wakeup for another idle CPU.

As the irq affinity is not supported by all the archs, a flag is needed
to specify which clocksource can handle it.


Not completely related to this series but there is another
issue where this local timer not wakeup capable hurts. So
far we are discussing only the timer related future events
which are known and can be programmed with broadcast device.

But think of the scenario's where we need to send asynchronous
IPIs to CPUs to do some work. e.g generic_exec_single().
If the CPU which is suppose to be available after IPI call
is in deep low power state, then the IPI(implemented on ARM)
isn't effective. In CPU off idle modes, a GIC SGI will not wake
the CPU and hence a special wakeup is needed to bring out
those CPUs out of idle. This special wakeup is handled
by broad-cast timer in case of CPUIDLE.

In short what I mean is, you need to have IPI which
can wakeup CPUs from any deep idle power state to address
above. Has anybody thought of this one ?

Regards,
Santosh
P.S: Time and again it proves that making the local timer wakeup
capable solves the issue.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 0/8] ARM: AM43 (OMAP2+) boot support

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:05 PM, Afzal Mohammed wrote:

(Resending, since it seems, LAKML doesn't accept patches with subject
prefix only as "RFC", but requires "PATCH" prefix also)

Hi,

This series adds minimal support to boot Linux on platforms having
AM43 based SoC's.

This is being sent as an RFC to seek opinion about modification in
twd to register percpu local timer clock event for scheduler tick in
the case of one core SMP.

AM43 SoC's are based on ARM Cortex-A9. It is an ARM Cortex-A9 SMP
configuration with one core (not uniprocessor configuration). AM43 is
similar to AM335x in it's peripheral capabilities, with many of the
peripheral register mapping's similar like that of uart.


After looking at the specs, you don't need the SMP mode since ACP
isn't being used.



AM43 is in pre-silicon stage and currently there are no public
documents.

This series has been tested on a pre-silicon platform that emulates
AM43 SoC, changes proposed here are minimal - to get it booting.
Kernel was directly run without the help of bootloader - Images were
directly loaded onto a pre-initialized RAM and ARM registers updated
as required for booting.

Changes have been made over linux-next (next-20130213) with three "OF"
related reverts (which otherwise causes problem in other platforms
also) and compiled with omap2plus_defconfig. Multiplatform option was
enabled, while most of CONFIG options were deselected for a faster
boot. Beagle bone boots as earlier with these changes.

An interesting observation is that it may be possible to boot this
platform to console without any platform specific modification to
proper Kernel (by that I mean excluding DT sources) using Arnd's,

"[PATCH,RFC] default machine descriptor for multiplatform",

along with a "CLOCKSOURCE_OF_DECLARE" for smp twd.


TWD use for AM437x is also limited because these times stops in
low power sates and there you will need broad-cast mechanism which
again more of SMP machine feature.

So I suggest to use the wakeup timer(GPT1) clock-event instead
of local timer for the mentioned reason.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 2/8] ARM: twd: register clock event for 1 core SMP

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote:

Register percpu local timer for scheduler tick in the case of one core
SMP configuration. In other cases - secondary cpu's as well as boot
cpu's having more than one core, this is being registered as per
existing boot flow, with a difference that they happens after delay
calibration. Registering the clock for tick in case of one core should
be done before Kernel calibrates delay (this is required to boot,
unless local timer is the only one registered for tick). Registering
twd local timer at init_time (which platforms are doing now) helps
achieve that with the proposed change.

This helps in an almost booting Kernel (minimal) by only relying on
ARM parts for an A9 one core SMP.

Signed-off-by: Afzal Mohammed 
---

As mentioned in cover-letter, I don't think we have good
reasoning to make TWD to work with UP configuration. Even
you fix the timer code, there are more cascaded dependencies
which is not worth the effort.

Regards,
Samtosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 3/8] ARM: twd: clock rate from DT (if no DT clk tree)

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote:

Add an optional property to find clock-frequency from DT. This helps
as a fallback mechanism in case there is no representation of clock
tree in DT.

Signed-off-by: Afzal Mohammed 
---

This won't work always because twd clock is CPU dependent and changes
with CPU clock speed. And more importantly to get CPUFreq working, you
need to provide the correct clock-node to TWD library. Refer OMAP4 clock
data for reference.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 1/8] ARM: localtimer: return percpu clkevt on register

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:06 PM, Afzal Mohammed wrote:

Return percpu clock event on local timer register. It is the boot cpu
that calls this and it can use the returned percpu clock event to
register a clock event in the case of SMP configuration with one core.

SMP configuration with 1 core is UP :-)
Jokes apart as said already, lets see whether we really need it.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 4/8] ARM: am33xx: ll debug config help

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote:

Selecting DEBUG_AM33XXUART1 routes low level debug messages to first
UART instance of AM335x based SoC's. This selection is valid for
upcoming AM43 based SoC's too. Make this information available upon
configuring.

Signed-off-by: Afzal Mohammed 
---
  arch/arm/Kconfig.debug | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index aca..b717b78 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -542,6 +542,9 @@ choice

config DEBUG_AM33XXUART1
bool "AM33XX UART1"
+   help
+ Route low level debug messages to first uart instance
+ for boards based on am335 and am43 family of SoC's

config DEBUG_ZOOM_UART
bool "Zoom2/3 UART"


With DT, IIRC DEBUGLL is broken. So did you hack debug-macro.S
to get the earlyprintk working ?

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 5/8] ARM: OMAP2+: am43: Kconfig

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote:

Add Kconfig option for AM43 family of SoC's, these are ARM Cortex A9
based (SMP configuration with 1 core).

Signed-off-by: Afzal Mohammed 
---
  arch/arm/mach-omap2/Kconfig | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 49ac3df..683fbaa 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -141,6 +141,17 @@ config SOC_AM33XX
select MULTI_IRQ_HANDLER
select COMMON_CLK

+config SOC_AM43
+   bool "TI AM43"
+   depends on ARCH_OMAP2PLUS
+   default y
+   select CPU_V7
+   select HAVE_SMP

You don't need this

+   select LOCAL_TIMERS if SMP

This one as well


+   select MULTI_IRQ_HANDLER

And this one too I guess.

Rest is fine.

regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 6/8] ARM: OMAP2+: am43: basic dt support

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote:

Describe minimal DT boot machine details for AM43 based SoC's. AM43
family of SoC's are ARM Cortex-A9 based with one core in SMP
configuration. Low level debug could be achieved by selecting
DEBUG_AM33XXUART1. To boot AM43 SoC, this change is sufficient w.r.t
Kernel (considering the fact that strictly speaking DT sources does
not classify as a part of Kernel). Here private timer of the one and
only core is being used as clock event (as well as, as time source).

Signed-off-by: Afzal Mohammed 
---

As discussed already, lets just call this as Cortex-A9 single core
machine to avoid confusion.


  arch/arm/mach-omap2/board-generic.c | 18 ++
  1 file changed, 18 insertions(+)

diff --git a/arch/arm/mach-omap2/board-generic.c 
b/arch/arm/mach-omap2/board-generic.c
index 0274ff7..e083f08 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -15,7 +15,10 @@
  #include 
  #include 
  #include 
+#include 

+#include 
+#include 
  #include 

  #include "common.h"
@@ -182,3 +185,18 @@ DT_MACHINE_START(OMAP5_DT, "Generic OMAP5 (Flattened Device 
Tree)")
.restart= omap44xx_restart,
  MACHINE_END
  #endif
+
+#ifdef CONFIG_SOC_AM43
+static const char *am43_boards_compat[] __initdata = {
+   "ti,am43",
+   NULL,
+};
+
+DT_MACHINE_START(AM43_DT, "Generic AM43 (Flattened Device Tree)")
+   .map_io = debug_ll_io_init,
+   .init_irq   = irqchip_init,

With Arnds patch [1], the above can be dropped..


+   .init_machine   = omap_generic_init,
+   .init_time  = twd_local_timer_of_register,

This one needs to take care of other timers as well from start. So
better to take that approach from start.

Regards,
Santosh

[1] https://patchwork.kernel.org/patch/2074871/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support

2013-02-19 Thread Santosh Shilimkar


On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote:

AM43 SoC is in pre-silicon stage, meanwhile it has been modelled in
a pre-silicon platform. To validate and boot Linux in pre-silicon
platform that emulates an AM43 SoC, add DT build support.

As bootloader is not used, bootargs is passed through DT.

Note: This would be replaced by an original board support.

Signed-off-by: Afzal Mohammed 
---
  arch/arm/boot/dts/Makefile |  3 ++-
  arch/arm/boot/dts/am43-pre-silicon.dts | 31 +++
  2 files changed, 33 insertions(+), 1 deletion(-)
  create mode 100644 arch/arm/boot/dts/am43-pre-silicon.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 94d88b9..b434344 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -124,7 +124,8 @@ dtb-$(CONFIG_ARCH_OMAP2PLUS) += omap2420-h4.dtb \
omap5-evm.dtb \
am335x-evm.dtb \
am335x-evmsk.dtb \
-   am335x-bone.dtb
+   am335x-bone.dtb \
+   am43-pre-silicon.dtb
  dtb-$(CONFIG_ARCH_ORION5X) += orion5x-lacie-ethernet-disk-mini-v2.dtb
  dtb-$(CONFIG_ARCH_PRIMA2) += prima2-evb.dtb
  dtb-$(CONFIG_ARCH_U8500) += snowball.dtb \
diff --git a/arch/arm/boot/dts/am43-pre-silicon.dts 
b/arch/arm/boot/dts/am43-pre-silicon.dts
new file mode 100644
index 000..b9c6297
--- /dev/null
+++ b/arch/arm/boot/dts/am43-pre-silicon.dts


Well the pre-silicon platform and the SOC are very
close and at least the support you are adding here is
exactly same. So lets just use am437x.dtb or something
like that.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support

2013-02-19 Thread Santosh Shilimkar


On Tuesday 19 February 2013 04:22 PM, Mohammed, Afzal wrote:

Hi Santosh,

On Tue, Feb 19, 2013 at 16:05:22, Shilimkar, Santosh wrote:

On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote:



AM43 SoC is in pre-silicon stage, meanwhile it has been modelled in
a pre-silicon platform. To validate and boot Linux in pre-silicon
platform that emulates an AM43 SoC, add DT build support.



Note: This would be replaced by an original board support.



-   am335x-bone.dtb
+   am335x-bone.dtb \
+   am43-pre-silicon.dtb



Well the pre-silicon platform and the SOC are very
close and at least the support you are adding here is
exactly same. So lets just use am437x.dtb or something
like that.


SoC support is already added in patch 7/8. This is board (which doesn't
exist now) support, hence a pre-silicon temporary one to validate it.


I mean we can call it am437x-xyxboard.dtb already considering the data
here can be re-used. Boot-args can be used from default kernel config
with CONFIG_CMDLINE_FORCE.

No strong opinion if you still insist to have a pre-silicon dtb.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 5/8] ARM: OMAP2+: am43: Kconfig

2013-02-19 Thread Santosh Shilimkar


On Tuesday 19 February 2013 04:26 PM, Felipe Balbi wrote:

On Tue, Feb 19, 2013 at 03:57:07PM +0530, Santosh Shilimkar wrote:

On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote:

Add Kconfig option for AM43 family of SoC's, these are ARM Cortex A9
based (SMP configuration with 1 core).

Signed-off-by: Afzal Mohammed 
---
  arch/arm/mach-omap2/Kconfig | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index 49ac3df..683fbaa 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -141,6 +141,17 @@ config SOC_AM33XX
select MULTI_IRQ_HANDLER
select COMMON_CLK

+config SOC_AM43
+   bool "TI AM43"
+   depends on ARCH_OMAP2PLUS
+   default y
+   select CPU_V7
+   select HAVE_SMP

You don't need this


actually, this is needed for CONFIG_SMP_ON_UP


Ahh.. I missed that. Thanks Felipe for pointing it
out.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 4/8] ARM: am33xx: ll debug config help

2013-02-19 Thread Santosh Shilimkar


On Tuesday 19 February 2013 04:00 PM, Mohammed, Afzal wrote:

Hi Santosh,

On Tue, Feb 19, 2013 at 15:55:59, Shilimkar, Santosh wrote:


With DT, IIRC DEBUGLL is broken. So did you hack debug-macro.S
to get the earlyprintk working ?


No, on linux-next, ll debug works properly.


Indeed. Tony fixed that now. Some how I missed
this patch on the list.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support

2013-02-19 Thread Santosh Shilimkar


On Tuesday 19 February 2013 04:33 PM, Mohammed, Afzal wrote:

Hi Santosh,

On Tue, Feb 19, 2013 at 16:30:13, Shilimkar, Santosh wrote:

On Tuesday 19 February 2013 04:22 PM, Mohammed, Afzal wrote:



SoC support is already added in patch 7/8. This is board (which doesn't
exist now) support, hence a pre-silicon temporary one to validate it.


I mean we can call it am437x-xyxboard.dtb already considering the data
here can be re-used. Boot-args can be used from default kernel config
with CONFIG_CMDLINE_FORCE.

No strong opinion if you still insist to have a pre-silicon dtb.


This patch would be replaced by original board, once it is known.
This was included to make a working complete series and if someone
wants (internally) to test the series as is.


Ohh. I assumed it is for merge as well.
Thanks for clarification.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH, RFC 2/8] ARM: twd: register clock event for 1 core SMP

2013-02-19 Thread Santosh Shilimkar


On Tuesday 19 February 2013 05:44 PM, Felipe Balbi wrote:

On Tue, Feb 19, 2013 at 03:44:14PM +0530, Santosh Shilimkar wrote:

On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote:

Register percpu local timer for scheduler tick in the case of one core
SMP configuration. In other cases - secondary cpu's as well as boot
cpu's having more than one core, this is being registered as per
existing boot flow, with a difference that they happens after delay
calibration. Registering the clock for tick in case of one core should
be done before Kernel calibrates delay (this is required to boot,
unless local timer is the only one registered for tick). Registering
twd local timer at init_time (which platforms are doing now) helps
achieve that with the proposed change.

This helps in an almost booting Kernel (minimal) by only relying on
ARM parts for an A9 one core SMP.

Signed-off-by: Afzal Mohammed 
---

As mentioned in cover-letter, I don't think we have good
reasoning to make TWD to work with UP configuration. Even
you fix the timer code, there are more cascaded dependencies
which is not worth the effort.


if CONFIG_SMP_ON_UP is enabled, smp_twd.c can still be compiled, right ?


Yep though just from deps pesrpective TWD is made available for ARM SMP
machines as below

config HAVE_ARM_TWD
bool
depends on SMP

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [resend] Timer broadcast question

2013-02-20 Thread Santosh Shilimkar


On Tuesday 19 February 2013 11:51 PM, Daniel Lezcano wrote:

On 02/19/2013 07:10 PM, Thomas Gleixner wrote:

On Tue, 19 Feb 2013, Daniel Lezcano wrote:

I am working on identifying the different wakeup sources from the
interrupts and I have a question regarding the timer broadcast.

The broadcast timer is setup to the next event and that will wake up any
idle cpu belonging to the "broadcast cpumask", right ?

The cpu which has been woken up will look for each cpu the next-event
and send an IPI to wake it up.

Although, it is possible the sender of this IPI may not be concerned by
the timer expiration and has been woken up just for sending the IPI, right ?


Correct.


If this is correct, is it possible to setup the timer irq affinity to a
cpu which will be concerned by the timer expiration ? so we prevent an
unnecessary wake up for a cpu.


It is possible, but we never implemented it.

If we go there, we want to make that conditional on a property flag,
because some interrupt controllers especially on x86 only allow to
move the affinity from interrupt context, which is pointless.


Thanks Thomas for your quick answer. I will write a RFC patchset.


Last year I implemented the affinity hook for broad-cast code and
experimented with it. Since the system I was using was dual core,
it wasn't much beneficial and hence gave up later. I did remember
discussing the approach with few folks in the conference.

Patch in the end of the email (also attached) for generic broadcast
code. I didn't look at all corner case though. In arch code then
you need to setup "broadcast_affinity" hook which should be able
to get handle of the arch irqchip and call the respective affinity
handler. Just 3 lines function should do the trick.

As Thomas said, effectiveness of such optimization solely depends
on how well the affinity (in low powers) supported by your IRQ chip.

Hope this is helpful for you.

Regards,
Santosh


From d70f2d48ec08a3f1d73187c49b16e4e60f81a50c Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar 
Date: Wed, 25 Jul 2012 03:42:33 +0530
Subject: [PATCH] tick-broadcast: Add tick road-cast affinity suport

Current tick broad-cast code has affinity set to the boot CPU and hence
the boot CPU will always wakeup from low power states when broad cast timer
is armed even if the next expiry event doesn't belong to it.

Patch adds broadcast affinity functionality to avoid above and let the
tick framework set the affinity of the event for the CPU it belongs.

Signed-off-by: Santosh Shilimkar 
---
 include/linux/clockchips.h   |2 ++
 kernel/time/tick-broadcast.c |   13 -
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index 8a7096f..5488cdc 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -95,6 +95,8 @@ struct clock_event_device {
unsigned long   retries;

void(*broadcast)(const struct cpumask *mask);
+   void(*broadcast_affinity)
+   (const struct cpumask *mask, int irq);
void(*set_mode)(enum clock_event_mode mode,
struct clock_event_device *);
void(*suspend)(struct clock_event_device *);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f113755..2ec2425 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -39,6 +39,8 @@ static void tick_broadcast_clear_oneshot(int cpu);
 static inline void tick_broadcast_clear_oneshot(int cpu) { }
 #endif

+static inline void dummy_broadcast_affinity(const struct cpumask *mask,
+   int irq) { }
 /*
  * Debugging: see timer_list.c
  */
@@ -485,14 +487,19 @@ void tick_broadcast_oneshot_control(unsigned long 
reason)

if (!cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) {
cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
-   if (dev->next_event.tv64 < bc->next_event.tv64)
+   if (dev->next_event.tv64 < bc->next_event.tv64) {
tick_broadcast_set_event(dev->next_event, 1);
+   bc->broadcast_affinity(
+   tick_get_broadcast_oneshot_mask(), bc->irq);
+   }
}
} else {
if (cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) {
cpumask_clear_cpu(cpu,
  tick_get_broadcast_oneshot_mask());
clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT);
+   bc->broadcast_affinity(
+

Re: [resend] Timer broadcast question

2013-02-21 Thread Santosh Shilimkar


On Thursday 21 February 2013 02:31 PM, Daniel Lezcano wrote:

On 02/21/2013 07:19 AM, Santosh Shilimkar wrote:

On Tuesday 19 February 2013 11:51 PM, Daniel Lezcano wrote:

On 02/19/2013 07:10 PM, Thomas Gleixner wrote:

On Tue, 19 Feb 2013, Daniel Lezcano wrote:

I am working on identifying the different wakeup sources from the
interrupts and I have a question regarding the timer broadcast.

The broadcast timer is setup to the next event and that will wake up
any
idle cpu belonging to the "broadcast cpumask", right ?

The cpu which has been woken up will look for each cpu the next-event
and send an IPI to wake it up.

Although, it is possible the sender of this IPI may not be concerned by
the timer expiration and has been woken up just for sending the IPI,
right ?


Correct.


If this is correct, is it possible to setup the timer irq affinity to a
cpu which will be concerned by the timer expiration ? so we prevent an
unnecessary wake up for a cpu.


It is possible, but we never implemented it.

If we go there, we want to make that conditional on a property flag,
because some interrupt controllers especially on x86 only allow to
move the affinity from interrupt context, which is pointless.


Thanks Thomas for your quick answer. I will write a RFC patchset.


Last year I implemented the affinity hook for broad-cast code and
experimented with it. Since the system I was using was dual core,
it wasn't much beneficial and hence gave up later. I did remember
discussing the approach with few folks in the conference.


I did a brief test with a similar patch on a ARM u8500 board. The timer
is tied with CPU0 by default, setting the dynamic irq affinity reduce
considerably the number of IPI. The difference with your patch is the
affinity is set to one CPU, the first one which is supposed to be wake
up by the timer expiration.

This is easy to spot with a small program doing usleep wired on CPU1.

We see CPU0 waking up to send an IPI to CPU1 and going to idle again.

I don't know how that behaves with OMAP4 with this patch (which I guess
it is the board you used), but the coupled idle state traces could be
ambiguous if you relied on it to check the benefit of this patch.


Across OMAP4 and OMAP5 based devices, only the general purpose OMAP5
devices the approach was useful. Rest of the devices had constraints
of master CPU(CPU0) waking up first always which in turns means pining
the affinity to that CPU always which the current code already does.
That was also another reason I didn't persue it further.


IMO, it is worth to implement such solution and perhaps we can extend it
to optimize the package idle time with the generic power domain tied
with the irq. Anyway, it is a random thought let's see that later :)


It is surely a good optimization especially for multi-core CPUIdle.


Patch in the end of the email (also attached) for generic broadcast
code. I didn't look at all corner case though. In arch code then
you need to setup "broadcast_affinity" hook which should be able
to get handle of the arch irqchip and call the respective affinity
handler. Just 3 lines function should do the trick.

As Thomas said, effectiveness of such optimization solely depends
on how well the affinity (in low powers) supported by your IRQ chip.

Hope this is helpful for you.


Thanks a lot for your patch and your feedbacks.


Am glad that it was helpful.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-21 Thread Santosh Shilimkar


Thomas,

On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:

Jason,

On Thu, 21 Feb 2013, Jason Liu wrote:

2013/2/21 Thomas Gleixner :

Now your explanation makes sense.

I have no fast solution for this, but I think that I have an idea how
to fix it. Stay tuned.


Thanks Thomas, wait for your fix. :)


find below a completely untested patch, which should address that issue.


After looking at the thread, I tried to see the issue on OMAP and could
see the same issue as Jason.

Your patch fixes the retries on both CPUs on my dual core machine. So
you use my tested by if you need one.
Tested-by: Santosh Shilimkar 

Thanks for the patch.
And thanks to Jason for spotting the issue.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-22 Thread Santosh Shilimkar


Thomas,

On Friday 22 February 2013 03:49 AM, Thomas Gleixner wrote:

On Thu, 21 Feb 2013, Santosh Shilimkar wrote:

On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote:

find below a completely untested patch, which should address that issue.


After looking at the thread, I tried to see the issue on OMAP and could
see the same issue as Jason.


That's interesting. We have the same issue on x86 since 2007 and
nobody noticed ever. It's basically the same problem there, but it
seems that on x86 getting out of those low power states is way slower
than the minimal reprogramming delta which is used to enforce the
local timer to fire after the wakeup.

I'm still amazed that as Jason stated a 1us reprogramming delta is
sufficient to get this ping-pong going. I somehow doubt that, but
maybe ARM is really that fast :)


Your patch fixes the retries on both CPUs on my dual core machine. So
you use my tested by if you need one.


They are always welcome.


BTW, Lorenzo off-list mentioned to me about warning in boot-up
which I missed while testing your patch. It will take bit more
time for me to look into it and hence thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501 
tick_broadcast_oneshot_control+0x1c0/0x21c()

[2.200622] Modules linked in:
[2.203826] [] (unwind_backtrace+0x0/0xf0) from 
[] (warn_slowpath_common+0x4c/0x64)
[2.213653] [] (warn_slowpath_common+0x4c/0x64) from 
[] (warn_slowpath_null+0x1c/0x24)
[2.223754] [] (warn_slowpath_null+0x1c/0x24) from 
[] (tick_broadcast_oneshot_control+0x1c0/0x21c)
[2.234924] [] (tick_broadcast_oneshot_control+0x1c0/0x21c) 
from [] (tick_notify+0x23c/0x42c)
[2.245666] [] (tick_notify+0x23c/0x42c) from [] 
(notifier_call_chain+0x44/0x84)
[2.255218] [] (notifier_call_chain+0x44/0x84) from 
[] (raw_notifier_call_chain+0x18/0x20)
[2.265686] [] (raw_notifier_call_chain+0x18/0x20) from 
[] (clockevents_notify+0x2c/0x174)
[2.276123] [] (clockevents_notify+0x2c/0x174) from 
[] (omap_enter_idle_smp+0x3c/0x120)
[2.286315] [] (omap_enter_idle_smp+0x3c/0x120) from 
[] (cpuidle_enter+0x14/0x18)
[2.295928] [] (cpuidle_enter+0x14/0x18) from [] 
(cpuidle_wrap_enter+0x34/0xa0)
[2.305389] [] (cpuidle_wrap_enter+0x34/0xa0) from 
[] (cpuidle_idle_call+0xe0/0x328)
[2.315307] [] (cpuidle_idle_call+0xe0/0x328) from 
[] (cpu_idle+0x8c/0x11c)
[2.324401] [] (cpu_idle+0x8c/0x11c) from [] 
(start_kernel+0x2b0/0x300)

[2.333129] ---[ end trace 6fe1f7b4606a9e20 ]---


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-22 Thread Santosh Shilimkar


On Friday 22 February 2013 03:54 PM, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Santosh Shilimkar wrote:

BTW, Lorenzo off-list mentioned to me about warning in boot-up
which I missed while testing your patch. It will take bit more
time for me to look into it and hence thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501
tick_broadcast_oneshot_control+0x1c0/0x21c()


Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?


The force broadcast mask one coming from below.
"WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));"

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: too many timer retries happen when do local timer swtich with broadcast timer

2013-02-22 Thread Santosh Shilimkar


On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote:

On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote:

On Fri, 22 Feb 2013, Santosh Shilimkar wrote:

BTW, Lorenzo off-list mentioned to me about warning in boot-up
which I missed while testing your patch. It will take bit more
time for me to look into it and hence thought of reporting it.

[2.186126] [ cut here ]
[2.190979] WARNING: at kernel/time/tick-broadcast.c:501
tick_broadcast_oneshot_control+0x1c0/0x21c()


Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ?


It is the tick_force_broadcast_mask and I think that's because on all
systems we are testing, the broadcast timer IRQ is a thundering herd,
all CPUs get out of idle at once and try to get out of broadcast mode
at more or less the same time.


So the issue comes ups only when the idle state used where CPU wakeup
more or less at same time as Lorenzo mentioned. I have two platforms
where I could test the patch and see the issue only with one platform.

Yesterday I didn't notice the warning because it wasn't seen on that
platform :-) OMAP4 idle entry and exit in deep state is staggered
between CPUs and hence the warning isn't seen. On OMAP5 though,
there is an additional C-state where idle entry/exit for CPU
isn't staggered and I see the issue in that case.

Actually the broad-cast code doesn't expect such a behavior
from CPUs since only the broad-cast affine CPU should wake
up and rest of the CPU should be woken up by the broad-cast
IPIs.

Regards,
Santosh








--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 2/2] cpufreq: cpufreq-cpu0: provide compatibility string for DT matchup

2013-03-12 Thread Santosh Shilimkar

On Tuesday 12 March 2013 07:58 PM, Benoit Cousson wrote:
> On 03/12/2013 06:07 AM, Santosh Shilimkar wrote:
>> On Tuesday 12 March 2013 04:35 AM, Nishanth Menon wrote:
>>> commit 5553f9e (cpufreq: instantiate cpufreq-cpu0 as a platform_driver)
>>> now forces platform device to be registered for allowing cpufreq-cpu0
>>> to be used by SoCs. example: drivers/cpufreq/highbank-cpufreq.c
>>>
>>> However, for SoCs that wish to link up using device tree, instead
>>> of platform device, provide compatibility string match:
>>> compatible = "cpufreq,cpu0";
> 
> You cannot add a non-HW relative binding... DT is supposed to represent
> the pure HW.
> AFAIK, cpufreq has nothing to do with the HW definition.
> 
You are right. 

>>>
>>> Cc: "Rafael J. Wysocki" 
>>> Cc: Santosh Shilimkar 
>>> Cc: Shawn Guo 
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: cpuf...@vger.kernel.org
>>> Cc: linux...@vger.kernel.org
>>> Cc: linux-o...@vger.kernel.org
>>>
>>> Signed-off-by: Nishanth Menon 
>>> ---
>>>  .../devicetree/bindings/cpufreq/cpufreq-cpu0.txt   |3 +++
>>>  drivers/cpufreq/cpufreq-cpu0.c |6 ++
>>>  2 files changed, 9 insertions(+)
>>>
>> Looks fine to me. CC'ing dt list in case some one has
>> comments on binding updates.
>>
>> Acked-by: Santosh Shilimkar 
> 
> Not-Acked-by-me.
> 
I obviously missed the point while acking the patch.

Regards,
santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [patch 0/7] tick: Optimize broadcast handling and prevent expiry ping pong

2013-03-13 Thread Santosh Shilimkar

Thomas,

On Wednesday 06 March 2013 04:48 PM, Thomas Gleixner wrote:
> Jason decoded a problem related to the broadcast timer mode. The
> reprogramming of the cpu local timer causes a huge number of
> retries. Also there is a situation where the CPU which does not handle
> the broadcast timer interrupt exits and reenters broadcast mode before
> the broadcast interrupt got handled by another CPU. This can lead to
> an interesting ping pong of the broadcast and the cpu local timer
> code.
> 
> This series addresses these problems. The first two patches convert
> the broadcast code to proper cpumask_var_t instead of adding more
> bitmaps later.
> 
> The rest of the series is adopted from the quick patches which I
> posted earlier while discussing the issue with Jason et. al.
> 
> Please give it a proper testing on your affected hardware.
> 
I have tested this revised patches on OMAP4 and OMAP5 platforms
with CPUidle enabled against 3.9-rc2. As expected they seems to
work without any issue and also fixes the reported retry issue.

Tested-by: Santosh Shilimkar 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'

2013-03-14 Thread Santosh Shilimkar

Will,

On Wednesday 13 March 2013 05:59 PM, Lokesh Vutla wrote:
> Hi Dietmar,
> On Wednesday 13 March 2013 05:35 PM, Dietmar Eggemann wrote:
>> On 13/03/13 06:52, Lokesh Vutla wrote:
>>> Commit {9a6eb31 ARM: hw_breakpoint: Debug powerdown support for
>>> self-hosted
>>> debug} introduces debug powerdown support for self-hosted debug.
>>> While merging the patch 'has_ossr' check was removed which
>>> was needed for hardwares which doesn't support self-hosted debug.
>>> Pandaboard (A9) is one such hardware and Dietmar's orginial
>>> patch did mention this issue.
>>> Without that check on Panda with CPUIDLE enabled, a flood of
>>> below messages thrown.
>>>
>>> [ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch
>>> [ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch
>>>
>>
>> Hi Lokesh,
>>
>> I confirm that this has_ossr condition has to go back into the
>> pm_init(void) call. I just verified it again on my Panda board and I get
>> the same issue like you without it.
>>
>> I guess what was happening is that Will asked me if this check is really
>> necessary and I said No without re-testing on my Panda board as an V7
>> debug architecture example where OSSR is not implemented. Since then I
>> only tested in on V7.1 debug architectures where OSSR is mandatory.
>> Sorry about this and thanks for catching this.
> Thanks for confirming..:)
> 
Can you please queue this one for 3.9-rc2+ ? Without this patch
CPUIDLE is unusable on OMAP4 devices because of those flood
of warning messages.

I was also wondering whether we should just warn once rather
than continuous warnings in the notifier. Patch is end of the
email.

Regards,
Santosh

>From b8db63f786719aef835f1ef4e20f3b3406b99b62 Mon Sep 17 00:00:00 2001
From: Santosh Shilimkar 
Date: Thu, 14 Mar 2013 13:03:25 +0530
Subject: [PATCH] ARM: hw_breakpoints: Use warn_once to avoid message flood on
 unsupported ossr machines

Signed-off-by: Santosh Shilimkar 
---
 arch/arm/kernel/hw_breakpoint.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 96093b7..5dc1aa6 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -966,7 +966,7 @@ static void reset_ctrl_regs(void *unused)
}
 
if (err) {
-   pr_warning("CPU %d debug is powered down!\n", cpu);
+   pr_warn_once("CPU %d debug is powered down!\n", cpu);
cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu));
return;
}
@@ -987,7 +987,7 @@ clear_vcr:
isb();
 
if (cpumask_intersects(&debug_err_mask, cpumask_of(cpu))) {
-   pr_warning("CPU %d failed to disable vector catch\n", cpu);
+   pr_warn_once("CPU %d failed to disable vector catch\n", cpu);
return;
}
 
@@ -1007,7 +1007,7 @@ clear_vcr:
}
 
if (cpumask_intersects(&debug_err_mask, cpumask_of(cpu))) {
-   pr_warning("CPU %d failed to clear debug register pairs\n", 
cpu);
+   pr_warn_once("CPU %d failed to clear debug register pairs\n", 
cpu);
return;
}
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 1/1] drivers: bus: Move the OMAP interconnect driver to drivers/bus/

2012-09-14 Thread Santosh Shilimkar

OMAP interconnect drivers are used for the interconnect error handling.
Since they are bus driver, lets move it to newly created drivers/bus.

Cc: Arnd Bergmann 
Cc: Tony Lindgren 
Tested-by: Lokesh Vutla 
Signed-off-by: Santosh Shilimkar 
---
Patch just moves OMAP interconnect drivers as is to the newly created
driver/bus/* directory. Patch is generated against "arm-soc/drivers/ocp2scp"
tree and test on all OMAP boards.

 arch/arm/mach-omap2/Kconfig|2 ++
 arch/arm/mach-omap2/Makefile   |5 -
 drivers/bus/Kconfig|6 ++
 drivers/bus/Makefile   |3 +++
 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.c |0
 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.h |0
 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.c |0
 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.h |0
 8 files changed, 11 insertions(+), 5 deletions(-)
 rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.c (100%)
 rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.h (100%)
 rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.c (100%)
 rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.h (100%)

diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index dd2db02..7d3c8ab 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -44,6 +44,7 @@ config ARCH_OMAP3
select ARM_CPU_SUSPEND if PM
select MULTI_IRQ_HANDLER
select SOC_HAS_OMAP2_SDRC
+   select OMAP_INTERCONNECT
 
 config ARCH_OMAP4
bool "TI OMAP4"
@@ -63,6 +64,7 @@ config ARCH_OMAP4
select USB_ARCH_HAS_EHCI if USB_SUPPORT
select ARM_CPU_SUSPEND if PM
select ARCH_NEEDS_CPU_IDLE_COUPLED
+   select OMAP_INTERCONNECT
 
 config SOC_OMAP5
bool "TI OMAP5"
diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index f6a24b3..7fed980 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -199,11 +199,6 @@ obj-$(CONFIG_ARCH_OMAP4)   += 
omap_hwmod_44xx_data.o
 # EMU peripherals
 obj-$(CONFIG_OMAP3_EMU)+= emu.o
 
-# L3 interconnect
-obj-$(CONFIG_ARCH_OMAP3)   += omap_l3_smx.o
-obj-$(CONFIG_ARCH_OMAP4)   += omap_l3_noc.o
-obj-$(CONFIG_SOC_OMAP5)+= omap_l3_noc.o
-
 obj-$(CONFIG_OMAP_MBOX_FWK)+= mailbox_mach.o
 mailbox_mach-objs  := mailbox.o
 
diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig
index 6270415..bbec35d 100644
--- a/drivers/bus/Kconfig
+++ b/drivers/bus/Kconfig
@@ -12,4 +12,10 @@ config OMAP_OCP2SCP
  OCP2SCP and in OMAP5, both USB PHY and SATA PHY is connected via
  OCP2SCP.
 
+config OMAP_INTERCONNECT
+   tristate "OMAP INTERCONNECT DRIVER"
+   depends on ARCH_OMAP2PLUS
+
+   help
+ Driver to enable OMAP interconnect error handling driver.
 endmenu
diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile
index 0ec50bc..45d997c 100644
--- a/drivers/bus/Makefile
+++ b/drivers/bus/Makefile
@@ -3,3 +3,6 @@
 #
 
 obj-$(CONFIG_OMAP_OCP2SCP) += omap-ocp2scp.o
+
+# Interconnect bus driver for OMAP SoCs.
+obj-$(CONFIG_OMAP_INTERCONNECT)+= omap_l3_smx.o omap_l3_noc.o
diff --git a/arch/arm/mach-omap2/omap_l3_noc.c b/drivers/bus/omap_l3_noc.c
similarity index 100%
rename from arch/arm/mach-omap2/omap_l3_noc.c
rename to drivers/bus/omap_l3_noc.c
diff --git a/arch/arm/mach-omap2/omap_l3_noc.h b/drivers/bus/omap_l3_noc.h
similarity index 100%
rename from arch/arm/mach-omap2/omap_l3_noc.h
rename to drivers/bus/omap_l3_noc.h
diff --git a/arch/arm/mach-omap2/omap_l3_smx.c b/drivers/bus/omap_l3_smx.c
similarity index 100%
rename from arch/arm/mach-omap2/omap_l3_smx.c
rename to drivers/bus/omap_l3_smx.c
diff --git a/arch/arm/mach-omap2/omap_l3_smx.h b/drivers/bus/omap_l3_smx.h
similarity index 100%
rename from arch/arm/mach-omap2/omap_l3_smx.h
rename to drivers/bus/omap_l3_smx.h
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ARM: Push selects for TWD/SCU into machine entries

2012-10-04 Thread Santosh Shilimkar


On Thursday 04 October 2012 02:20 PM, Stephen Boyd wrote:

The TWD and SCU configs are selected by default as long as
SCORPIONMP is false and/or MCT is false. Implementing the logic
this way certainly saves lines in the Kconfig but it precludes
those machines which select SCORPIONMP or MCT from participating
in the single zImage effort because when those machines are
combined with other SMP capable machines the TWD and SCU are no
longer selected.

Push the select out to the machine entries so that we can compile
these machines together and still select the appropriate configs.

Signed-off-by: Stephen Boyd
Cc: David Brown
Cc: Kukjin Kim
Cc: Linus Walleij
Cc: Pawel Moll
Cc: Rob Herring
Cc: Russell King
Cc: Sascha Hauer
Cc: Shiraz Hashim
Cc: Simon Horman
Cc: Srinidhi Kasagar
Cc: Stephen Warren
Cc: Tony Lindgren
Cc: Viresh Kumar
---

Does OMAP5 need to select TWD? I suspect not if it uses the
architected timers.


Nope. OMAP5 don't use TWD. Infact the external SCU is also used
for A9 SOCs. You might want to check other A15 SOCS for SCU as
well.

[..]


  arch/arm/mach-omap2/Kconfig| 4 


[..]


diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig
index a6219ea..b618748 100644
--- a/arch/arm/mach-omap2/Kconfig
+++ b/arch/arm/mach-omap2/Kconfig
@@ -58,7 +58,9 @@ config ARCH_OMAP4
select CPU_V7
select ARM_GIC
select HAVE_SMP
+   select HAVE_ARM_SCU if SMP
select LOCAL_TIMERS if SMP
+   select HAVE_ARM_TWD if LOCAL_TIMERS
select PL310_ERRATA_588369
select PL310_ERRATA_727915
select ARM_ERRATA_720789

Ok.


@@ -75,6 +77,8 @@ config SOC_OMAP5
select CPU_V7
select ARM_GIC
select HAVE_SMP
+   select HAVE_ARM_SCU if SMP
+   select HAVE_ARM_TWD if LOCAL_TIMERS
select ARM_CPU_SUSPEND if PM
select SOC_HAS_REALTIME_COUNTER
select ARM_ARCH_TIMER

Drop this change.

With that fixed, for OMAP changes

Acked-by: Santosh Shilimkar 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ARM: Push selects for TWD/SCU into machine entries

2012-10-04 Thread Santosh Shilimkar


On Friday 05 October 2012 06:42 AM, Simon Horman wrote:

On Thu, Oct 04, 2012 at 02:41:46PM +0530, Santosh Shilimkar wrote:

On Thursday 04 October 2012 02:20 PM, Stephen Boyd wrote:

The TWD and SCU configs are selected by default as long as
SCORPIONMP is false and/or MCT is false. Implementing the logic
this way certainly saves lines in the Kconfig but it precludes
those machines which select SCORPIONMP or MCT from participating
in the single zImage effort because when those machines are
combined with other SMP capable machines the TWD and SCU are no
longer selected.

Push the select out to the machine entries so that we can compile
these machines together and still select the appropriate configs.

Signed-off-by: Stephen Boyd
Cc: David Brown
Cc: Kukjin Kim
Cc: Linus Walleij
Cc: Pawel Moll
Cc: Rob Herring
Cc: Russell King
Cc: Sascha Hauer
Cc: Shiraz Hashim
Cc: Simon Horman
Cc: Srinidhi Kasagar
Cc: Stephen Warren
Cc: Tony Lindgren
Cc: Viresh Kumar
---

Does OMAP5 need to select TWD? I suspect not if it uses the
architected timers.


Nope. OMAP5 don't use TWD. Infact the external SCU is also used
for A9 SOCs. You might want to check other A15 SOCS for SCU as
well.


In that case I am a bit confused by the following result:

# git checkout v3.6
# ARCH=arm make omap2plus_defconfig
# grep '(SOC_OMAP5|_SCU|_TWD|CONFIG_SMP|CONFIG_LOCAL_TIMERS)=' .config
CONFIG_SOC_OMAP5=y
CONFIG_SMP=y
CONFIG_HAVE_ARM_SCU=y
CONFIG_HAVE_ARM_TWD=y
CONFIG_LOCAL_TIMERS=y


Thats because omap2plus_defconfig build all the OMAP machines together
including OMAP4 and OMAP5.

Regards
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 2/6] sched: add a new SD SHARE_POWERLINE flag for sched_domain

2012-10-24 Thread Santosh Shilimkar


Vincent,

Few comments/questions.

On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:

This new flag SD SHARE_POWERLINE reflects the sharing of the power rail
between the members of a domain. As this is the current assumption of the
scheduler, the flag is added to all sched_domain

Signed-off-by: Vincent Guittot 
---
  arch/ia64/include/asm/topology.h |1 +
  arch/tile/include/asm/topology.h |1 +
  include/linux/sched.h|1 +
  include/linux/topology.h |3 +++
  kernel/sched/core.c  |5 +
  5 files changed, 11 insertions(+)

diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h
index a2496e4..065c720 100644
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -65,6 +65,7 @@ void build_cpu_to_node_map(void);
| SD_BALANCE_EXEC   \
| SD_BALANCE_FORK   \
| SD_WAKE_AFFINE,   \
+   | arch_sd_share_power_line()\
.last_balance   = jiffies,  \
.balance_interval   = 1,\
.nr_balance_failed  = 0,\
diff --git a/arch/tile/include/asm/topology.h b/arch/tile/include/asm/topology.h
index 7a7ce39..d39ed0b 100644
--- a/arch/tile/include/asm/topology.h
+++ b/arch/tile/include/asm/topology.h
@@ -72,6 +72,7 @@ static inline const struct cpumask *cpumask_of_node(int node)
| 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER   \
| 0*SD_SHARE_PKG_RESOURCES  \
+   | arch_sd_share_power_line()\
| 0*SD_SERIALIZE\
,   \
.last_balance   = jiffies,  \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4786b20..74f2daf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -862,6 +862,7 @@ enum cpu_idle_type {
  #define SD_WAKE_AFFINE0x0020  /* Wake task to waking CPU */
  #define SD_PREFER_LOCAL   0x0040  /* Prefer to keep tasks local 
to this domain */
  #define SD_SHARE_CPUPOWER 0x0080  /* Domain members share cpu power */
+#define SD_SHARE_POWERLINE 0x0100  /* Domain members share power domain */

If you ignore the current use of SD_SHARE_CPUPOWER, isn't the meaning of
CPUPOWER and POWERLINE is same here. Just trying to understand the clear
meaning of this new flag. Have you not considered SD_SHARE_CPUPOWER
because it is being used for cpu_power and needs at least minimum two
domains ? SD_PACKING would have been probably more appropriate based
on the way it is being used in further series.

Regards
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 3/6] sched: pack small tasks

2012-10-24 Thread Santosh Shilimkar


Vincent,

Few comments/questions.

On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:

During sched_domain creation, we define a pack buddy CPU if available.

On a system that share the powerline at all level, the buddy is set to -1

On a dual clusters / dual cores system which can powergate each core and
cluster independantly, the buddy configuration will be :
   | CPU0 | CPU1 | CPU2 | CPU3 |
---
buddy | CPU0 | CPU0 | CPU0 | CPU2 |

^
Is that a typo ? Should it be CPU2 instead of
CPU0 ?


Small tasks tend to slip out of the periodic load balance.
The best place to choose to migrate them is at their wake up.


I have tried this series since I was looking at some of these packing
bits. On Mobile workloads like OSIdle with Screen ON, MP3, gallary,
I did see some additional filtering of threads with this series
but its not making much difference in power. More on this below.


Signed-off-by: Vincent Guittot 
---
  kernel/sched/core.c  |1 +
  kernel/sched/fair.c  |  109 ++
  kernel/sched/sched.h |1 +
  3 files changed, 111 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dab7908..70cadbe 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6131,6 +6131,7 @@ cpu_attach_domain(struct sched_domain *sd, struct 
root_domain *rd, int cpu)
rcu_assign_pointer(rq->sd, sd);
destroy_sched_domains(tmp, cpu);

+   update_packing_domain(cpu);
update_top_cache_domain(cpu);
  }

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4f4a4f6..8c9d3ed 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -157,6 +157,63 @@ void sched_init_granularity(void)
update_sysctl();
  }

+
+/*
+ * Save the id of the optimal CPU that should be used to pack small tasks
+ * The value -1 is used when no buddy has been found
+ */
+DEFINE_PER_CPU(int, sd_pack_buddy);
+
+/* Look for the best buddy CPU that can be used to pack small tasks
+ * We make the assumption that it doesn't wort to pack on CPU that share the

s/wort/worth

+ * same powerline. We looks for the 1st sched_domain without the
+ * SD_SHARE_POWERLINE flag. Then We look for the sched_group witht the lowest
+ * power per core based on the assumption that their power efficiency is
+ * better */

Commenting style..
/*
 *
 */

Can you please expand the why the assumption is right ?
"it doesn't wort to pack on CPU that share the same powerline"

Think about a scenario where you have quad core, ducal cluster system

|Cluster1|  |cluster 2|
| CPU0 | CPU1 | CPU2 | CPU3 |   | CPU0 | CPU1 | CPU2 | CPU3 |


Both clusters run from same voltage rail and have same PLL
clocking them. But the cluster have their own power domain
and all CPU's can power gate them-self to low power states.
Clusters also have their own level2 caches.

In this case, you will still save power if you try to pack
load on one cluster. No ?


+void update_packing_domain(int cpu)
+{
+   struct sched_domain *sd;
+   int id = -1;
+
+   sd = highest_flag_domain(cpu, SD_SHARE_POWERLINE);
+   if (!sd)
+   sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
+   else
+   sd = sd->parent;
+
+   while (sd) {
+   struct sched_group *sg = sd->groups;
+   struct sched_group *pack = sg;
+   struct sched_group *tmp = sg->next;
+
+   /* 1st CPU of the sched domain is a good candidate */
+   if (id == -1)
+   id = cpumask_first(sched_domain_span(sd));
+
+   /* loop the sched groups to find the best one */
+   while (tmp != sg) {
+   if (tmp->sgp->power * sg->group_weight <
+   sg->sgp->power * tmp->group_weight)
+   pack = tmp;
+   tmp = tmp->next;
+   }
+
+   /* we have found a better group */
+   if (pack != sg)
+   id = cpumask_first(sched_group_cpus(pack));
+
+   /* Look for another CPU than itself */
+   if ((id != cpu)
+|| ((sd->parent) && !(sd->parent->flags && SD_LOAD_BALANCE)))

Is the condition "!(sd->parent->flags && SD_LOAD_BALANCE)" for
big.LITTLE kind of system where SD_LOAD_BALANCE may not be used ?


+   break;
+
+   sd = sd->parent;
+   }
+
+   pr_info(KERN_INFO "CPU%d packing on CPU%d\n", cpu, id);
+   per_cpu(sd_pack_buddy, cpu) = id;
+}
+
  #if BITS_PER_LONG == 32
  # define WMULT_CONST  (~0UL)
  #else
@@ -3073,6 +3130,55 @@ static int select_idle_sibling(struct task_struct *p, 
int target)
return target;
  }

+static inline bool is_buddy_busy(int cpu)
+{
+   struct rq *rq = cpu_rq(cpu);
+
+   /*
+* A busy buddy is a CPU with a high load or a small load with a lot of

Re: [RFC 4/6] sched: secure access to other CPU statistics

2012-10-24 Thread Santosh Shilimkar


$subject is bit confusing here.

On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:

The atomic update of runnable_avg_sum and runnable_avg_period are ensured
by their size and the toolchain. But we must ensure to not read an old value
for one field and a newly updated value for the other field. As we don't
want to lock other CPU while reading these fields, we read twice each fields
and check that no change have occured in the middle.

Signed-off-by: Vincent Guittot 
---
  kernel/sched/fair.c |   19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8c9d3ed..6df53b5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3133,13 +3133,28 @@ static int select_idle_sibling(struct task_struct *p, 
int target)
  static inline bool is_buddy_busy(int cpu)
  {
struct rq *rq = cpu_rq(cpu);
+   volatile u32 *psum = &rq->avg.runnable_avg_sum;
+   volatile u32 *pperiod = &rq->avg.runnable_avg_period;
+   u32 sum, new_sum, period, new_period;
+   int timeout = 10;

So it can be 2 times read or more as well.

+
+   while (timeout) {
+   sum = *psum;
+   period = *pperiod;
+   new_sum = *psum;
+   new_period = *pperiod;
+
+   if ((sum == new_sum) && (period == new_period))
+   break;
+
+   timeout--;
+   }


Seems like you did notice incorrect pair getting read
for rq runnable_avg_sum and runnable_avg_period. Seems
like the fix is to update them together under some lock
to avoid such issues.

Regards
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 5/6] sched: pack the idle load balance

2012-10-24 Thread Santosh Shilimkar


On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:

Look for an idle CPU close the pack buddy CPU whenever possible.

s/close/close to

The goal is to prevent the wake up of a CPU which doesn't share the power
line of the pack CPU

Signed-off-by: Vincent Guittot 
---
  kernel/sched/fair.c |   18 ++
  1 file changed, 18 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6df53b5..f76acdc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5158,7 +5158,25 @@ static struct {

  static inline int find_new_ilb(int call_cpu)
  {
+   struct sched_domain *sd;
int ilb = cpumask_first(nohz.idle_cpus_mask);
+   int buddy = per_cpu(sd_pack_buddy, call_cpu);
+
+   /*
+* If we have a pack buddy CPU, we try to run load balance on a CPU
+* that is close to the buddy.
+*/
+   if (buddy != -1)
+   for_each_domain(buddy, sd) {
+   if (sd->flags & SD_SHARE_CPUPOWER)
+   continue;

Do you mean SD_SHARE_POWERLINE here ?

+
+   ilb = cpumask_first_and(sched_domain_span(sd),
+   nohz.idle_cpus_mask);
+
+   if (ilb < nr_cpu_ids)
+   break;
+   }

if (ilb < nr_cpu_ids && idle_cpu(ilb))
return ilb;


Can you please expand "idle CPU _close_ the pack buddy CPU" ?

Regards
santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 6/6] ARM: sched: clear SD_SHARE_POWERLINE

2012-10-24 Thread Santosh Shilimkar


On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:

The ARM platforms take advantage of packing small tasks on few cores.
This is true even when the cores of a cluster can't be powergated
independently.

Signed-off-by: Vincent Guittot 
---
  arch/arm/kernel/topology.c |5 +
  1 file changed, 5 insertions(+)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 26c12c6..00511d0 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -226,6 +226,11 @@ static inline void update_cpu_power(unsigned int cpuid, 
unsigned int mpidr) {}
   */
  struct cputopo_arm cpu_topology[NR_CPUS];

+int arch_sd_share_power_line(void)
+{
+   return 0*SD_SHARE_POWERLINE;
+}


Making this selection of policy based on sched domain will better. Just
gives the flexibility to choose a separate scheme for big and little
systems which will be very convenient.

Regards
Santosh





--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH RFT] ARM64: dma-mapping: support debug_dma_mapping_error

2012-10-26 Thread Santosh Shilimkar


On Friday 26 October 2012 04:17 AM, Shuah Khan wrote:

Add support for debug_dma_mapping_error() call to avoid warning from
debug_dma_unmap() interface when it checks for mapping error checked
status. Without this patch, device driver failed to check map error
warning is generated.

Signed-off-by: Shuah Khan 
---

Looks good. Similar fix for 32bit ARM port is done in below commit.

From 871ae57adc5ed092c1341f411514d0e8482e2611 Mon Sep 17 00:00:00 2001
From: Ming Lei 
Date: Mon, 22 Oct 2012 20:44:03 +0800
Subject: [PATCH] ARM: dma-mapping: support debug_dma_mapping_error


Regards
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check

2012-11-01 Thread Santosh Shilimkar


On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote:

Replaces several flags bearing the same meaning. There is no need
to set flags due to different omap types here, it can be checked
in appropriate places as well.

Cc: Tony Lindgren 
Cc: Russell King 
Cc: linux-o...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ivan Khoronzhuk 
---
  arch/arm/mach-omap2/id.c  |   25 +++--
  arch/arm/mach-omap2/soc.h |8 ++--
  2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/arch/arm/mach-omap2/id.c b/arch/arm/mach-omap2/id.c
index cf2362c..3c47a19 100644
--- a/arch/arm/mach-omap2/id.c
+++ b/arch/arm/mach-omap2/id.c
@@ -28,6 +28,9 @@
  #include "soc.h"
  #include "control.h"

+#define OMAP4_SILICON_TYPE_STANDARD0x01
+#define OMAP4_SILICON_TYPE_PERFORMANCE 0x02
+
  static unsigned int omap_revision;
  static const char *cpu_rev;
  u32 omap_features;
@@ -273,25 +276,11 @@ void __init omap4xxx_check_features(void)
  {
u32 si_type;

-   if (cpu_is_omap443x())
-   omap_features |= OMAP4_HAS_MPU_1GHZ;
-
+   si_type =
+   (read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1) >> 16) & 0x03;

-   if (cpu_is_omap446x()) {
-   si_type =
-   read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1);
-   switch ((si_type & (3 << 16)) >> 16) {
-   case 2:
-   /* High performance device */
-   omap_features |= OMAP4_HAS_MPU_1_5GHZ;
-   break;
-   case 1:
-   default:
-   /* Standard device */
-   omap_features |= OMAP4_HAS_MPU_1_2GHZ;
-   break;
-   }
-   }
+   if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE)
+   omap_features = OMAP4_HAS_PERF_SILICON;


Well the detection isn't for performance/standard but there are some
features depend on it. For example 1 GHz doesn't DPLL DCC enable feature
where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this
information is also effused. Have you considered this aspect while
creating this patch ?

Regards
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check

2012-11-01 Thread Santosh Shilimkar


On Thursday 01 November 2012 09:50 PM, ivan.khoronzhuk wrote:

On 11/01/2012 01:35 PM, Santosh Shilimkar wrote:

On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote:

Replaces several flags bearing the same meaning. There is no need
to set flags due to different omap types here, it can be checked
in appropriate places as well.

Cc: Tony Lindgren 
Cc: Russell King 
Cc: linux-o...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ivan Khoronzhuk 
---
  arch/arm/mach-omap2/id.c  |   25 +++--
  arch/arm/mach-omap2/soc.h |8 ++--
  2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/arch/arm/mach-omap2/id.c b/arch/arm/mach-omap2/id.c
index cf2362c..3c47a19 100644
--- a/arch/arm/mach-omap2/id.c
+++ b/arch/arm/mach-omap2/id.c
@@ -28,6 +28,9 @@
  #include "soc.h"
  #include "control.h"

+#define OMAP4_SILICON_TYPE_STANDARD0x01
+#define OMAP4_SILICON_TYPE_PERFORMANCE0x02
+
  static unsigned int omap_revision;
  static const char *cpu_rev;
  u32 omap_features;
@@ -273,25 +276,11 @@ void __init omap4xxx_check_features(void)
  {
  u32 si_type;

-if (cpu_is_omap443x())
-omap_features |= OMAP4_HAS_MPU_1GHZ;
-
+si_type =
+(read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1) >> 16)
& 0x03;

-if (cpu_is_omap446x()) {
-si_type =
- read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1);
-switch ((si_type & (3 << 16)) >> 16) {
-case 2:
-/* High performance device */
-omap_features |= OMAP4_HAS_MPU_1_5GHZ;
-break;
-case 1:
-default:
-/* Standard device */
-omap_features |= OMAP4_HAS_MPU_1_2GHZ;
-break;
-}
-}
+if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE)
+omap_features = OMAP4_HAS_PERF_SILICON;


Well the detection isn't for performance/standard but there are some
features depend on it. For example 1 GHz doesn't DPLL DCC enable feature
where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this
information is also effused. Have you considered this aspect while
creating this patch ?

Regards
Santosh



I had considered it. There is no dependency on the features.
DCC usage depends on asked frequency on the fly, not from the features.
Depending on "performance/standard" feature the available frequencies
should be chosen in places where they are needed, for example while
initializing OPPs.


You are correct about the way DCC is handled in the clock code. Infact
all these features like L2CACHE, SGX, IVA etc is more for to display
boot messages.


Currently we have several features while it is only one indeed.


1GHz, 1.2GHz, 1.5 GHz are not same since the silicon capability itself
is different.

git blame tells me that Nishant has sent this update so looping him
if above differentiation in boot log helps.

Nishant,
What's your take on removing above freq prints and marking
those silicon as performance silicons as the $subject patch does ?

Regards
Santosh






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check

2012-11-01 Thread Santosh Shilimkar


On Thursday 01 November 2012 10:36 PM, Nishanth Menon wrote:

On 22:05-20121101, Santosh Shilimkar wrote:

On Thursday 01 November 2012 09:50 PM, ivan.khoronzhuk wrote:

On 11/01/2012 01:35 PM, Santosh Shilimkar wrote:

On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote:

Replaces several flags bearing the same meaning. There is no need
to set flags due to different omap types here, it can be checked
in appropriate places as well.

Cc: Tony Lindgren 
Cc: Russell King 
Cc: linux-o...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ivan Khoronzhuk 
---


[..]

+if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE)
+omap_features = OMAP4_HAS_PERF_SILICON;


Well the detection isn't for performance/standard but there are some
features depend on it. For example 1 GHz doesn't DPLL DCC enable feature
where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this
information is also effused. Have you considered this aspect while
creating this patch ?

Regards
Santosh



I had considered it. There is no dependency on the features.
DCC usage depends on asked frequency on the fly, not from the features.
Depending on "performance/standard" feature the available frequencies
should be chosen in places where they are needed, for example while
initializing OPPs.


You are correct about the way DCC is handled in the clock code. Infact
all these features like L2CACHE, SGX, IVA etc is more for to display
boot messages.


Currently we have several features while it is only one indeed.


1GHz, 1.2GHz, 1.5 GHz are not same since the silicon capability itself
is different.

git blame tells me that Nishant has sent this update so looping him
if above differentiation in boot log helps.

Nishant,
What's your take on removing above freq prints and marking
those silicon as performance silicons as the $subject patch does ?

Regards
Santosh

Yes $subject patch is a better approach compared to having freq based
handling which just increases the number of macros we need to enable
depending on SoC variants that we spin off the main SoC. This also
allows us to conserve the features bitfield ahead as well.

I hate to admit, but after a couple of generations of SoC spinoffs,
my original logic is proving to be was pretty short sighted,
unfortunately :(

So, approach
Acked-by: Nishanth Menon 


Thanks Nishant for clarification and ack.

With the clarification I also like the subject patch.
Feel free add.

Acked-by: Santosh Shilimkar 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 2/6] sched: add a new SD SHARE_POWERLINE flag for sched_domain

2012-11-02 Thread Santosh Shilimkar


On Monday 29 October 2012 03:20 PM, Vincent Guittot wrote:

  It looks like i need to describe more what

On 29 October 2012 10:40, Vincent Guittot  wrote:

On 24 October 2012 17:17, Santosh Shilimkar  wrote:

Vincent,

Few comments/questions.


On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:


This new flag SD SHARE_POWERLINE reflects the sharing of the power rail
between the members of a domain. As this is the current assumption of the
scheduler, the flag is added to all sched_domain

Signed-off-by: Vincent Guittot 
---
   arch/ia64/include/asm/topology.h |1 +
   arch/tile/include/asm/topology.h |1 +
   include/linux/sched.h|1 +
   include/linux/topology.h |3 +++
   kernel/sched/core.c  |5 +
   5 files changed, 11 insertions(+)

diff --git a/arch/ia64/include/asm/topology.h
b/arch/ia64/include/asm/topology.h
index a2496e4..065c720 100644
--- a/arch/ia64/include/asm/topology.h
+++ b/arch/ia64/include/asm/topology.h
@@ -65,6 +65,7 @@ void build_cpu_to_node_map(void);
 | SD_BALANCE_EXEC   \
 | SD_BALANCE_FORK   \
 | SD_WAKE_AFFINE,   \
+   | arch_sd_share_power_line()\
 .last_balance   = jiffies,  \
 .balance_interval   = 1,\
 .nr_balance_failed  = 0,\
diff --git a/arch/tile/include/asm/topology.h
b/arch/tile/include/asm/topology.h
index 7a7ce39..d39ed0b 100644
--- a/arch/tile/include/asm/topology.h
+++ b/arch/tile/include/asm/topology.h
@@ -72,6 +72,7 @@ static inline const struct cpumask *cpumask_of_node(int
node)
 | 0*SD_PREFER_LOCAL \
 | 0*SD_SHARE_CPUPOWER   \
 | 0*SD_SHARE_PKG_RESOURCES  \
+   | arch_sd_share_power_line()\
 | 0*SD_SERIALIZE\
 ,   \
 .last_balance   = jiffies,  \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4786b20..74f2daf 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -862,6 +862,7 @@ enum cpu_idle_type {
   #define SD_WAKE_AFFINE0x0020  /* Wake task to waking CPU
*/
   #define SD_PREFER_LOCAL   0x0040  /* Prefer to keep tasks
local to this domain */
   #define SD_SHARE_CPUPOWER 0x0080  /* Domain members share cpu power
*/
+#define SD_SHARE_POWERLINE 0x0100  /* Domain members share power
domain */


If you ignore the current use of SD_SHARE_CPUPOWER, isn't the meaning of
CPUPOWER and POWERLINE is same here. Just trying to understand the clear
meaning of this new flag. Have you not considered SD_SHARE_CPUPOWER
because it is being used for cpu_power and needs at least minimum two
domains ? SD_PACKING would have been probably more appropriate based
on the way it is being used in further series.


CPUPOWER reflects the share of hw ressources between cores like for
hyper threading. POWERLINE describes the fact that cores are sharing
the same power line amore precisely the powergate.


Sorry, the mail has been sent too early while I was writing it

CPUPOWER reflects the share of hw ressources between cores like for
hyper threading. POWERLINE describes the fact that cores are sharing
the same power line and more precisely the same power gating. It looks
like I need to describe more precisely what i would mean with
SHARE_POWERLINE.


Yes. More description will help. I see bit of overlap POWERLINE
flag with SD_SHARE_CPUPOWER and SD_SHARE_PKG_RESOURCES and hence
the questions.



I don't want to use PACKING because it's more a behavior than a
feature. If cores can power gate independently (!SD_SHARE_POWERLINE),
packing small tasks is one interesting behavior but it may be not the
only one. I want to make a difference between the HW configuration and
the behavior we want to have above it


Fair enough. Thanks for clarification.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 3/6] sched: pack small tasks

2012-11-02 Thread Santosh Shilimkar


On Monday 29 October 2012 06:42 PM, Vincent Guittot wrote:

On 24 October 2012 17:20, Santosh Shilimkar  wrote:

Vincent,

Few comments/questions.


On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:


During sched_domain creation, we define a pack buddy CPU if available.

On a system that share the powerline at all level, the buddy is set to -1

On a dual clusters / dual cores system which can powergate each core and
cluster independantly, the buddy configuration will be :
| CPU0 | CPU1 | CPU2 | CPU3 |
---
buddy | CPU0 | CPU0 | CPU0 | CPU2 |


 ^
Is that a typo ? Should it be CPU2 instead of
CPU0 ?


No it's not a typo.
The system packs at each scheduling level. It starts to pack in
cluster because each core can power gate independently so CPU1 tries
to pack its tasks in CPU0 and CPU3 in CPU2. Then, it packs at CPU
level so CPU2 tries to pack in the cluster of CPU0 and CPU0 packs in
itself


I get it. Though in above example a task may migrate from say
CPU3->CPU2->CPU0 as part of packing. I was just thinking whether
moving such task from say CPU3 to CPU0 might be best instead.




Small tasks tend to slip out of the periodic load balance.
The best place to choose to migrate them is at their wake up.


I have tried this series since I was looking at some of these packing
bits. On Mobile workloads like OSIdle with Screen ON, MP3, gallary,
I did see some additional filtering of threads with this series
but its not making much difference in power. More on this below.


Can I ask you which configuration you have used ? how many cores and
cluster ?  Can they be power gated independently ?


I have been trying with couple of setups. Dual Core ARM machine and
Quad core X86 box with single package thought most of the mobile
workload analysis I was doing on ARM machine. On both setups
CPUs can be gated independently.





Signed-off-by: Vincent Guittot 
---
   kernel/sched/core.c  |1 +
   kernel/sched/fair.c  |  109
++
   kernel/sched/sched.h |1 +
   3 files changed, 111 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dab7908..70cadbe 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6131,6 +6131,7 @@ cpu_attach_domain(struct sched_domain *sd, struct
root_domain *rd, int cpu)
 rcu_assign_pointer(rq->sd, sd);
 destroy_sched_domains(tmp, cpu);

+   update_packing_domain(cpu);
 update_top_cache_domain(cpu);
   }

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4f4a4f6..8c9d3ed 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -157,6 +157,63 @@ void sched_init_granularity(void)
 update_sysctl();
   }

+
+/*
+ * Save the id of the optimal CPU that should be used to pack small tasks
+ * The value -1 is used when no buddy has been found
+ */
+DEFINE_PER_CPU(int, sd_pack_buddy);
+
+/* Look for the best buddy CPU that can be used to pack small tasks
+ * We make the assumption that it doesn't wort to pack on CPU that share
the


s/wort/worth


yes




+ * same powerline. We looks for the 1st sched_domain without the
+ * SD_SHARE_POWERLINE flag. Then We look for the sched_group witht the
lowest
+ * power per core based on the assumption that their power efficiency is
+ * better */


Commenting style..
/*
  *
  */



yes


Can you please expand the why the assumption is right ?
"it doesn't wort to pack on CPU that share the same powerline"


By "share the same power-line", I mean that the CPUs can't power off
independently. So if some CPUs can't power off independently, it's
worth to try to use most of them to race to idle.


In that case I suggest we use different word here. Power line can be
treated as voltage line, power domain.
May be SD_SHARE_CPU_POWERDOMAIN ?



Think about a scenario where you have quad core, ducal cluster system

 |Cluster1|  |cluster 2|
| CPU0 | CPU1 | CPU2 | CPU3 |   | CPU0 | CPU1 | CPU2 | CPU3 |


Both clusters run from same voltage rail and have same PLL
clocking them. But the cluster have their own power domain
and all CPU's can power gate them-self to low power states.
Clusters also have their own level2 caches.

In this case, you will still save power if you try to pack
load on one cluster. No ?


yes, I need to update the description of SD_SHARE_POWERLINE because
I'm afraid I was not clear enough. SD_SHARE_POWERLINE includes the
power gating capacity of each core. For your example above, the
SD_SHARE_POWERLINE shoud be cleared at both MC and CPU level.


Thanks for clarification.





+void update_packing_domain(int cpu)
+{
+   struct sched_domain *sd;
+   int id = -1;
+
+   sd = highest_flag_domain(cpu, SD_SHARE_POWERLINE);
+   if (!sd)
+   sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
+   else
+   sd = sd->

Re: [RFC 5/6] sched: pack the idle load balance

2012-11-02 Thread Santosh Shilimkar


On Monday 29 October 2012 06:57 PM, Vincent Guittot wrote:

On 24 October 2012 17:21, Santosh Shilimkar  wrote:

On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:


Look for an idle CPU close the pack buddy CPU whenever possible.


s/close/close to


yes




The goal is to prevent the wake up of a CPU which doesn't share the power
line of the pack CPU

Signed-off-by: Vincent Guittot 
---
   kernel/sched/fair.c |   18 ++
   1 file changed, 18 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6df53b5..f76acdc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5158,7 +5158,25 @@ static struct {

   static inline int find_new_ilb(int call_cpu)
   {
+   struct sched_domain *sd;
 int ilb = cpumask_first(nohz.idle_cpus_mask);
+   int buddy = per_cpu(sd_pack_buddy, call_cpu);
+
+   /*
+* If we have a pack buddy CPU, we try to run load balance on a
CPU
+* that is close to the buddy.
+*/
+   if (buddy != -1)
+   for_each_domain(buddy, sd) {
+   if (sd->flags & SD_SHARE_CPUPOWER)
+   continue;


Do you mean SD_SHARE_POWERLINE here ?


No, I just don't want to take hyperthread level for ILB




+
+   ilb = cpumask_first_and(sched_domain_span(sd),
+   nohz.idle_cpus_mask);
+
+   if (ilb < nr_cpu_ids)
+   break;
+   }

 if (ilb < nr_cpu_ids && idle_cpu(ilb))
 return ilb;


Can you please expand "idle CPU _close_ the pack buddy CPU" ?


The goal is to packed  the tasks on the pack buddy CPU so when the
scheduler needs to start ILB, I try to wake up a CPU that is close to
the buddy and preferably in the same cluster


I see your point now. Thanks for clarification.

Regards
santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC 6/6] ARM: sched: clear SD_SHARE_POWERLINE

2012-11-02 Thread Santosh Shilimkar


On Monday 29 October 2012 06:58 PM, Vincent Guittot wrote:

On 24 October 2012 17:21, Santosh Shilimkar  wrote:

On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote:


The ARM platforms take advantage of packing small tasks on few cores.
This is true even when the cores of a cluster can't be powergated
independently.


Signed-off-by: Vincent Guittot 
---
   arch/arm/kernel/topology.c |5 +
   1 file changed, 5 insertions(+)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index 26c12c6..00511d0 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -226,6 +226,11 @@ static inline void update_cpu_power(unsigned int
cpuid, unsigned int mpidr) {}
*/
   struct cputopo_arm cpu_topology[NR_CPUS];

+int arch_sd_share_power_line(void)
+{
+   return 0*SD_SHARE_POWERLINE;
+}



Making this selection of policy based on sched domain will better. Just
gives the flexibility to choose a separate scheme for big and little
systems which will be very convenient.


I agree that it would be more flexible to be able to set it for each level


Will you be addressing that in next version then ?

Regards
santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ARM: OMAP4: Basic default configuration for Pandaboard

2012-10-27 Thread Santosh Shilimkar


On Saturday 27 October 2012 05:33 PM, Constantine Shulyupin wrote:

From: Constantine Shulyupin 

Tested SD (MMC) and Ethernet smsc95xx on linux-3.7-rc2

Signed-off-by: Constantine Shulyupin 
--
kernel size is 2.3 MiB
---
  arch/arm/configs/omap4_panda_defconfig |  115 


We got rid of all omap board defconfigs and now just using
omap2plus_defconfig for all OMAP2/3/4/5 machines. If something
is missing from omap2plus_defconfig, which you need to on your
panda custom config, do send patch for that. Otherwise, we are
no longer adding board defconfigs.

Regards
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v3] gpio: interrupt consistency check for OF GPIO IRQs

2013-08-27 Thread Santosh Shilimkar

On Tuesday 27 August 2013 04:17 PM, Stephen Warren wrote:
> On 08/26/2013 08:07 AM, Lars Poeschel wrote:
>> From: Linus Walleij 
>>
>> Currently the kernel is ambigously treating GPIOs and interrupts
>> from a GPIO controller: GPIOs and interrupts are treated as
>> orthogonal. This unfortunately makes it unclear how to actually
>> retrieve and request a GPIO line or interrupt from a GPIO
>> controller in the device tree probe path.
> 
> I still think that this patch is the wrong approach. Instead, the logic
> should be hooked into gpio_request() and request_irq(). This patch only
> addresses DT, and ignores anything else, hence doesn't seem like the
> right level of abstraction to plug in, since the issue is not related to DT.
> 
On the issue, Rajendra cooked up a patch [1] to handle the issue at
irqdomain level which was shot down after further discussion. It might
be useful for the discussion here. Ofcourse it was also targeted to
solve only DT issue.

Regards,
Santosh

[1]
"[PATCH] irqdomain: Do a gpio_request in case of a gpio irq_chip"
http://www.mail-archive.com/linux-omap@vger.kernel.org/msg83451.html


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP

2013-10-01 Thread Santosh Shilimkar

On Tuesday 01 October 2013 09:48 AM, Rob Herring wrote:
> On 10/01/2013 06:13 AM, Sricharan R wrote:
>> Hi,
>>
>> On Monday 30 September 2013 08:39 PM, Rob Herring wrote:
>>> On 09/30/2013 08:59 AM, Sricharan R wrote:
 Some socs have a large number of interrupts requests to service
 the needs of its many peripherals and subsystems. All of the interrupt
 requests lines from the subsystems are not needed at the same
 time, so they have to be muxed to the controllers appropriately.
 In such places a interrupt controllers are preceded by an
 IRQ CROSSBAR that provides flexibility in muxing the device interrupt
 requests to the controller inputs.

 This series models the peripheral interrupts that can be routed through
 the crossbar to the GIC as 'routable-irqs'. The routable irqs are added
 in a separate linear domain inside the GIC. The registered routable 
 domain's
 callback are invoked as a part of the GIC's callback, which in turn should
 allocate a free irq line and configure the IP accordingly. So every 
 peripheral
 in the dts files mentions the fixed crossbar number as its interrupt. A 
 free
 gic line for that gets allocated and configured when the peripheral's 
 interrupt
 is mapped.

 The minimal crossbar driver to track and allocate free GIC lines and 
 configure the
 crossbar is added here, along with the DT bindings.
>>> Seems like interrupt-map property is what you need here.
>>>
>>> http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping
>>>
>>> Versatile Express also has an example.
>>OK, but the idea was not to tie up the crossbar<->interrupt numbers at the
>>DTS level, but to assign it dynamically during runtime. This was one of 
>> the
>>   comments that came up with first crossbar support patches, which was 
>> assigning a
>>   interrupt line to crossbar number in the DTS and setting it up in crossbar 
>> probe.
> 
> Is there an actual usecase on a single h/w design that you run out of
> interrupts and it is a user decision which interrupts to use?
>
Yes. There are 240 peripheral interrupts connected out of which 160 can
be used in this specific case. 
 
> You could fill in the interrupt-map at run-time. It would have to be
> early (bootloader or early kernel init) and can't be at request_irq time.
>
Well all options are tried before coming up to the $subject solution.
It was suggested by Thomas in the last review.
 
>> https://lkml.org/lkml/2013/7/18/416
>>
>>Since this approach of assigning in DTS was opposed, we moved to IRQCHIP 
>> and
>>that did not go as well. Finally was asked to handle this as a part of 
>> GIC driver with
>>a separate domain.
>>
>>   http://www.spinics.net/lists/linux-omap/msg97085.html
> 
> This has nothing to do with the GIC, so it does not belong there.
> 
Well the router makes connections from peripheral to GIC. Thomas can
better explain it but I think since its doing irq routing for GIC on
a given hardware, I don't see any issue having some generic map/unmap
function in GIC. The actual implementation is still outside of GIC.

Regards,
Sasntosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP

2013-10-01 Thread Santosh Shilimkar

On Tuesday 01 October 2013 10:53 AM, Rob Herring wrote:
> On 10/01/2013 08:57 AM, Santosh Shilimkar wrote:
>> On Tuesday 01 October 2013 09:48 AM, Rob Herring wrote:
>>> On 10/01/2013 06:13 AM, Sricharan R wrote:
>>>> Hi,
>>>>
>>>> On Monday 30 September 2013 08:39 PM, Rob Herring wrote:
>>>>> On 09/30/2013 08:59 AM, Sricharan R wrote:
>>>>>> Some socs have a large number of interrupts requests to service
>>>>>> the needs of its many peripherals and subsystems. All of the interrupt
>>>>>> requests lines from the subsystems are not needed at the same
>>>>>> time, so they have to be muxed to the controllers appropriately.
>>>>>> In such places a interrupt controllers are preceded by an
>>>>>> IRQ CROSSBAR that provides flexibility in muxing the device interrupt
>>>>>> requests to the controller inputs.
>>>>>>
>>>>>> This series models the peripheral interrupts that can be routed through
>>>>>> the crossbar to the GIC as 'routable-irqs'. The routable irqs are added
>>>>>> in a separate linear domain inside the GIC. The registered routable 
>>>>>> domain's
>>>>>> callback are invoked as a part of the GIC's callback, which in turn 
>>>>>> should
>>>>>> allocate a free irq line and configure the IP accordingly. So every 
>>>>>> peripheral
>>>>>> in the dts files mentions the fixed crossbar number as its interrupt. A 
>>>>>> free
>>>>>> gic line for that gets allocated and configured when the peripheral's 
>>>>>> interrupt
>>>>>> is mapped.
>>>>>>
>>>>>> The minimal crossbar driver to track and allocate free GIC lines and 
>>>>>> configure the
>>>>>> crossbar is added here, along with the DT bindings.
>>>>> Seems like interrupt-map property is what you need here.
>>>>>
>>>>> http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping
>>>>>
>>>>> Versatile Express also has an example.
>>>>OK, but the idea was not to tie up the crossbar<->interrupt numbers at 
>>>> the
>>>>DTS level, but to assign it dynamically during runtime. This was one of 
>>>> the
>>>>   comments that came up with first crossbar support patches, which was 
>>>> assigning a
>>>>   interrupt line to crossbar number in the DTS and setting it up in 
>>>> crossbar probe.
>>>
>>> Is there an actual usecase on a single h/w design that you run out of
>>> interrupts and it is a user decision which interrupts to use?
>>>
>> Yes. There are 240 peripheral interrupts connected out of which 160 can
>> be used in this specific case. 
> 
> Yes, I understand the SOC connections. That does not answer my question.
> The 240 interrupts are likely to be limited to fewer by board design,
> pinmuxing, etc.
> 
yes limited by different board designs ...

> How do you handle the 161st interrupt request? Will never happen? Just
> rely on the random driver probe ordering?
>
Well the board dts are expected to provide the peripheral support info to 
optimise it.
If a board actually has more than 160 peripherals available then in that
case the 161 interrupt will not be mapped. 
 
>>> You could fill in the interrupt-map at run-time. It would have to be
>>> early (bootloader or early kernel init) and can't be at request_irq time.
>>>
>> Well all options are tried before coming up to the $subject solution.
>> It was suggested by Thomas in the last review.
>>  
>>>> https://lkml.org/lkml/2013/7/18/416
>>>>
>>>>Since this approach of assigning in DTS was opposed, we moved to 
>>>> IRQCHIP and
>>>>that did not go as well. Finally was asked to handle this as a part of 
>>>> GIC driver with
>>>>a separate domain.
>>>>
>>>>   http://www.spinics.net/lists/linux-omap/msg97085.html
>>>
>>> This has nothing to do with the GIC, so it does not belong there.
>>>
>> Well the router makes connections from peripheral to GIC. Thomas can
>> better explain it but I think since its doing irq routing for GIC on
>> a given hardware, I don't see any issue having some generic map/unmap
>> function in GIC. The actual implementation is still outside of GIC.
> 
> I rea

[PATCH] sched_clock: fix postinit no sched_clock function check

2013-10-02 Thread Santosh Shilimkar

The sched_clock code uses 2 levels of function pointers, sched_clock_func()
and read_sched_clock() but the no sched_clock check in postinit() just
checks read_sched_clock().

This leads to kernel falling back to jiffy based sched clock even in
presence of sched_clock_func() which is not desirable.

Fix the postinit() check to avoid the issue. Probably the issue is hidden
so far on most of the arm SOCs because of already existing sched_clock
registrations apart from arch_timer sched_clock. One can reproduce the
issue by just have arch_timer as sched_clock

Cc: Stephen Boyd 
Cc: John Stultz 
Cc: Russell King 
Cc: Will Deacon 
Cc: Rob Herring 
Cc: Thomas Gleixner 

Signed-off-by: Santosh Shilimkar 
---
 kernel/time/sched_clock.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 0b479a6..15c4d78 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -179,7 +179,8 @@ void __init sched_clock_postinit(void)
 * If no sched_clock function has been provided at that point,
 * make it the final one one.
 */
-   if (read_sched_clock == jiffy_sched_clock_read)
+   if ((read_sched_clock == jiffy_sched_clock_read) &&
+   (sched_clock_func == sched_clock_32))
setup_sched_clock(jiffy_sched_clock_read, 32, HZ);
 
sched_clock_poll(sched_clock_timer.data);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] sched_clock: fix postinit no sched_clock function check

2013-10-02 Thread Santosh Shilimkar

On Wednesday 02 October 2013 01:09 PM, Will Deacon wrote:
> On Wed, Oct 02, 2013 at 05:55:28PM +0100, Santosh Shilimkar wrote:
>> The sched_clock code uses 2 levels of function pointers, sched_clock_func()
>> and read_sched_clock() but the no sched_clock check in postinit() just
>> checks read_sched_clock().
>>
>> This leads to kernel falling back to jiffy based sched clock even in
>> presence of sched_clock_func() which is not desirable.
>>
>> Fix the postinit() check to avoid the issue. Probably the issue is hidden
>> so far on most of the arm SOCs because of already existing sched_clock
>> registrations apart from arch_timer sched_clock. One can reproduce the
>> issue by just have arch_timer as sched_clock
> 
> Isn't this just an issue with the arch timer driver not calling
> setup_sched_clock? Instead, we munge around with sched_clock_func directly,
> which doesn't appear to be the way anybody else deals with this.
> 
I thought about that option as well but was not sure since even in that case
the check is not complete. We just ensure that function is popullated.

> I'm not sure of the history though, so perhaps there's a reason for this...
> 
Am curious as well.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] sched_clock: fix postinit no sched_clock function check

2013-10-02 Thread Santosh Shilimkar

On Wednesday 02 October 2013 01:22 PM, Stephen Boyd wrote:
> On 10/02/13 10:14, Santosh Shilimkar wrote:
>> On Wednesday 02 October 2013 01:09 PM, Will Deacon wrote:
>>> On Wed, Oct 02, 2013 at 05:55:28PM +0100, Santosh Shilimkar wrote:
>>>> The sched_clock code uses 2 levels of function pointers, sched_clock_func()
>>>> and read_sched_clock() but the no sched_clock check in postinit() just
>>>> checks read_sched_clock().
>>>>
>>>> This leads to kernel falling back to jiffy based sched clock even in
>>>> presence of sched_clock_func() which is not desirable.
>>>>
>>>> Fix the postinit() check to avoid the issue. Probably the issue is hidden
>>>> so far on most of the arm SOCs because of already existing sched_clock
>>>> registrations apart from arch_timer sched_clock. One can reproduce the
>>>> issue by just have arch_timer as sched_clock
>>> Isn't this just an issue with the arch timer driver not calling
>>> setup_sched_clock? Instead, we munge around with sched_clock_func directly,
>>> which doesn't appear to be the way anybody else deals with this.
>>>
>> I thought about that option as well but was not sure since even in that case
>> the check is not complete. We just ensure that function is popullated.
> 
> Yes, nothing is actually broken because sched_clock_func() won't try to
> use the jiffy based read_sched_clock() function. I'm not sure we
> actually need this patch besides to remove a useless timer that updates
> the jiffy epoch. Can we wait until my 64-bit sched_clock patch series
> lands in 3.13? It looks like I still need an ack from Will or Catalin on
> the architected timer patch before the clocksource folks pick it up.
> 
Really... I have not created patch out of fun.
Its broken on my keystone machine at least where the sched_clock is
falling back on jiffy based sched_clock even in presence of arch_timer
sched_clock.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] sched_clock: fix postinit no sched_clock function check

2013-10-02 Thread Santosh Shilimkar

On Wednesday 02 October 2013 01:48 PM, Will Deacon wrote:
> On Wed, Oct 02, 2013 at 06:42:40PM +0100, Stephen Boyd wrote:
>> On 10/02/13 10:27, Santosh Shilimkar wrote:
>>> Really... I have not created patch out of fun.
>>> Its broken on my keystone machine at least where the sched_clock is
>>> falling back on jiffy based sched_clock even in presence of arch_timer
>>> sched_clock.
>>
>> How is that possible? sched_clock_func is only assigned by
>> arch/arm/kernel/arch_timer.c when the architected timer is detected and
>> sched_clock() in kernel/time/sched_clock.c calls that function pointer
>> unconditionally. The only way I see this happening is if the architected
>> timer rate is zero.
>   ^^
> 
> *cough* CNTFRQ *cough*
> 
:) CNTFRQ as such is fine. I think the below print mis-lead me mostly. 

sched_clock: ARM arch timer >56 bits at 6144kHz, resolution 162ns
sched_clock: 32 bits at 100 Hz, resolution 1000ns, wraps every 4294967286ms

So yes, now the subject patch actually just avoids the jiffy sched_clock()
registration and nothing else. Even without the patch arch_timer sched_clock
will be in use.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Use of drivers/platform and matching include?

2013-10-04 Thread Santosh Shilimkar

On Friday 04 October 2013 12:48 PM, Olof Johansson wrote:
> On Fri, Oct 4, 2013 at 6:22 AM, Greg Kroah-Hartman
>  wrote:
>> On Fri, Oct 04, 2013 at 12:41:28PM +0100, Russell King - ARM Linux wrote:
>>>
>>> So, no, there will be no new drivers under arch/arm.  They must be in the
>>> drivers subtree somewhere.
>>
>> I have no objection with this, and encourage it.
> 
> Ok, so these are some of the requirements as far as I see it:
> 
> * No per-vendor driver dumping ground under drivers/* (i.e. no
> drivers/platform//)
> * No weirdly constructed single-driver directories directly under
> drivers/* (we already have a few and should look at moving those)
> because nothing else fits
> * We need some sort of convention on dependencies. Several of these
> are more libraries than drivers, i.e. we'll have cross-calls for
> things like queue management, resource allocation, etc. So having a
> single location to hold most of these makes sense instead of
> everything cross-depending on everything else.
> 
> Based on the above, how about we create something like
> drivers/resourcemgr to hold these?   I think at least parts of the
> mvebu-mbus driver that ended up under drivers/bus might be a fit to
> move there. The APM queue allocator would likely be a fit, and maybe
> some of the qualcomm stuff. Kumar, what are your thoughts on that?

Slightly different question but relevant to th thread w.r.t the Queue
allocator/manager. We are also interested for TI Keystone SOCs.

Currently we have generic drivers/hwqueue/ with core hwqueue
layer implementing the standard hardware queue descriptor push/pop
and notification mechanisms and then Keystone specific
driver using those core functionalities. I read most of the
networking SOCs has some sort of hwqueues but not sure about
the its implementations. So just thought of bringing
it into the thread discussion to see if hwqueue core layer is
of any interest to other SOCs.

Regards,
Santosh



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 1/2] ARM: OMAP: Add secure function omap_smc3() which calling instruction smc #1

2013-07-11 Thread Santosh Shilimkar

On Thursday 11 July 2013 03:43 PM, Ивайло Димитров wrote:
>  
> 
> 
> 
> 
>  > Оригинално писмо 
>  >От:  Dave Martin 
>  >Относно: Re: [PATCH v2 1/2] ARM: OMAP: Add secure function omap_smc3() which
>  calling instruction smc #1
>  >До: Pali Rohár 
>  >Изпратено на: Сряда, 2013, Юли 10 20:45:26 EEST
>  >
>  >
>  >On Wed, Jul 10, 2013 at 02:59:04PM +0200, Pali Rohár wrote:
>  >> Other secure functions omap_smc1() and omap_smc2() calling instruction 
> smc #0
>  >> but Nokia RX-51 board needs to call smc #1 for PPA access.
>  >> 
>  >> Signed-off-by: Ivaylo Dimitrov 
>  >> Signed-off-by: Pali Rohár 
>  >> ---
>  >>  arch/arm/mach-omap2/omap-secure.h |1 +
>  >>  arch/arm/mach-omap2/omap-smc.S|   22 +-
>  >>  2 files changed, 22 insertions(+), 1 deletion(-)
>  >> 
>  >> diff --git a/arch/arm/mach-omap2/omap-secure.h 
> b/arch/arm/mach-omap2/omap-secure.h
>  >> index 0e72917..c4586f4 100644
>  >> --- a/arch/arm/mach-omap2/omap-secure.h
>  >> +++ b/arch/arm/mach-omap2/omap-secure.h
>  >> @@ -51,6 +51,7 @@
>  >>  extern u32 omap_secure_dispatcher(u32 idx, u32 flag, u32 nargs,
>  >>   u32 arg1, u32 arg2, u32 arg3, u32 arg4);
>  >>  extern u32 omap_smc2(u32 id, u32 falg, u32 pargs);
>  >> +extern u32 omap_smc3(u32 id, u32 process, u32 flag, u32 pargs);
>  >>  extern phys_addr_t omap_secure_ram_mempool_base(void);
>  >>  extern int omap_secure_ram_reserve_memblock(void);
>  >>  
>  >> diff --git a/arch/arm/mach-omap2/omap-smc.S 
> b/arch/arm/mach-omap2/omap-smc.S
>  >> index f6441c1..5c02b8d 100644
>  >> --- a/arch/arm/mach-omap2/omap-smc.S
>  >> +++ b/arch/arm/mach-omap2/omap-smc.S
>  >> @@ -1,9 +1,11 @@
>  >>  /*
>  >> - * OMAP44xx secure APIs file.
>  >> + * OMAP34xx and OMAP44xx secure APIs file.
>  >>   *
>  >>   * Copyright (C) 2010 Texas Instruments, Inc.
>  >>   * Written by Santosh Shilimkar 
>  >>   *
>  >> + * Copyright (C) 2012 Ivaylo Dimitrov 
>  >> + * Copyright (C) 2013 Pali Rohár 
>  >>   *
>  >>   * This program is free software,you can redistribute it and/or modify
>  >>   * it under the terms of the GNU General Public License version 2 as
>  >> @@ -54,6 +56,24 @@ ENTRY(omap_smc2)
>  >>   ldmfd   sp!, {r4-r12, pc}
>  >>  ENDPROC(omap_smc2)
>  >>  
>  >> +/**
>  >> + * u32 omap_smc3(u32 service_id, u32 process_id, u32 flag, u32 pargs)
>  >> + * Low level common routine for secure HAL and PPA APIs via smc #1
>  >> + * r0 - @service_id: Secure Service ID
>  >> + * r1 - @process_id: Process ID
>  >> + * r2 - @flag: Flag to indicate the criticality of operation
>  >> + * r3 - @pargs: Physical address of parameter list
>  >> + */
>  >> +ENTRY(omap_smc3)
>  >> + stmfd   sp!, {r4-r12, lr}
>  >
>  >You don't need to save/restore r12.  The ABI allows it to be clobbered
>  >across function calls.
>  >
>  >> + mov r12, r0 @ Copy the secure service ID
>  >> + mov r6, #0xff   @ Indicate new Task call
>  >> + dsb
>  >> + dmb
>  >
>  >dsb synchronises a superset of what dmb synchronises, so the dmb here is
>  >not useful.
>  >
>  >In any case, any code calling this must flush the region addressed by
>  >r3 beforehand anyway, which will include a dsb as part of its semantics
>  >-- this is how you call it from rx51_secure_dispatcher().
>  >
>  >So I think the dsb may not be needed here (?)
>  >
>  >Cheers
>  >---Dave
>  >
>  >
> 
> Could be, but I wonder why almost all the kernel code(I am aware of) that 
> uses SMC and is written by TI, is storing r12 and is using both DSB and DMB. 
> See arch/arm/mach-omap2/sleep34xx.S or arch/arm/mach-omap2/omap-smc.S for 
> examples. I'd rather play it safe and leave it that way, unless someone 
> confirms the other code using SMC has extra DSB/DMB instructions too. I 
> wouldn't risk passing invalid/stale data to the Secure Monitor to just save 8 
> bytes and barriers in a performance non-critical code which will be called 
> only a couple of times during the boot-up process. r12 save/restore is a 
> legacy from omap_smc2 in omap-smc.S, so I guess it can go away without much 
> of a trouble.
> 
Dave pointed out about the dsb and r12 to me in another thread. R12 can be 
easily removed
but the DSB's were needed on OMAP for power sequencing. Without those, we have 
seen
many issues. So you can leave the dsb's to be consistent with rest of the code.

Regards
Santosh
P.S: Please sensibly wrap your message while replying. You reply apears like
one single line and I needed to keep scrolling to read it.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Santosh Shilimkar

DMA bounce limit is the maximum direct DMA'able memory beyond which
bounce buffers has to be used to perform dma operations. SCSI driver
relies on dma_mask but its calculation is based on max_*pfn which
don't have uniform meaning across architectures. So make use of
dma_max_pfn() which is expected to return the DMAable maximum pfn
value across architectures.

Cc: Russell King 
Cc: linux-s...@vger.kernel.org

Signed-off-by: Santosh Shilimkar 
---
 drivers/scsi/scsi_lib.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..e8275fa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
 
host_dev = scsi_get_device(shost);
if (host_dev && host_dev->dma_mask)
-   bounce_limit = *host_dev->dma_mask;
+   bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;
 
return bounce_limit;
 }
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC/RFT PATCH 5/5] ARM: mm: Remove bootmem code and switch to NO_BOOTMEM

2013-07-12 Thread Santosh Shilimkar

In the effort of using memblock instead of bootmem allocator, ARM arch
needs to be converted to use NO_BOOTMEM. With NO_BOOTMEM change,
now we use memblock allocator to reserve space for crash kernel to
have one less dependency with nobootmem allocator wrapper.

Hopefully the NO_BOOTMEM memblock wrapper(nobootmem.c) will vanish in
near future and archs can directly use memblock APIs. Ongoing thread
on this topic is here:
https://lkml.org/lkml/2013/6/29/77

Boot tested with both flat memory and sparse (faked) memory models
with highmem enabled. LAPE systems with memory starting > 4GB still
won't work but this is one of the step to solve that problem for ARM.

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 
Cc: Tejun Heo 

Signed-off-by: Santosh Shilimkar 
---
 arch/arm/Kconfig|1 +
 arch/arm/kernel/setup.c |2 +-
 arch/arm/mm/init.c  |   58 ++-
 3 files changed, 4 insertions(+), 57 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0ac9be6..cff9a59 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -63,6 +63,7 @@ config ARM
select OLD_SIGSUSPEND3
select OLD_SIGACTION
select HAVE_CONTEXT_TRACKING
+   select NO_BOOTMEM
help
  The ARM series is a line of low-power-consumption RISC chip designs
  licensed by ARM Ltd and targeted at embedded applications and
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 63af9a7..2ca4b90 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -805,7 +805,7 @@ static void __init reserve_crashkernel(void)
if (ret)
return;
 
-   ret = reserve_bootmem(crash_base, crash_size, BOOTMEM_EXCLUSIVE);
+   ret = memblock_reserve(crash_base, crash_size);
if (ret < 0) {
printk(KERN_WARNING "crashkernel reservation failed - "
   "memory is in use (0x%lx)\n", (unsigned long)crash_base);
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 588a2c1..84dd56c 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -153,58 +153,6 @@ static void __init find_limits(unsigned long *min, 
unsigned long *max_low,
*max_high = bank_pfn_end(&mi->bank[mi->nr_banks - 1]);
 }
 
-static void __init arm_bootmem_init(unsigned long start_pfn,
-   unsigned long end_pfn)
-{
-   struct memblock_region *reg;
-   unsigned int boot_pages;
-   phys_addr_t bitmap;
-   pg_data_t *pgdat;
-
-   /*
-* Allocate the bootmem bitmap page.  This must be in a region
-* of memory which has already been mapped.
-*/
-   boot_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-   bitmap = memblock_alloc_base(boot_pages << PAGE_SHIFT, L1_CACHE_BYTES,
-   __pfn_to_phys(end_pfn));
-
-   /*
-* Initialise the bootmem allocator, handing the
-* memory banks over to bootmem.
-*/
-   node_set_online(0);
-   pgdat = NODE_DATA(0);
-   init_bootmem_node(pgdat, __phys_to_pfn(bitmap), start_pfn, end_pfn);
-
-   /* Free the lowmem regions from memblock into bootmem. */
-   for_each_memblock(memory, reg) {
-   unsigned long start = memblock_region_memory_base_pfn(reg);
-   unsigned long end = memblock_region_memory_end_pfn(reg);
-
-   if (end >= end_pfn)
-   end = end_pfn;
-   if (start >= end)
-   break;
-
-   free_bootmem(__pfn_to_phys(start), (end - start) << PAGE_SHIFT);
-   }
-
-   /* Reserve the lowmem memblock reserved regions in bootmem. */
-   for_each_memblock(reserved, reg) {
-   unsigned long start = memblock_region_reserved_base_pfn(reg);
-   unsigned long end = memblock_region_reserved_end_pfn(reg);
-
-   if (end >= end_pfn)
-   end = end_pfn;
-   if (start >= end)
-   break;
-
-   reserve_bootmem(__pfn_to_phys(start),
-   (end - start) << PAGE_SHIFT, BOOTMEM_DEFAULT);
-   }
-}
-
 #ifdef CONFIG_ZONE_DMA
 
 unsigned long arm_dma_zone_size __read_mostly;
@@ -242,7 +190,7 @@ void __init setup_dma_zone(struct machine_desc *mdesc)
 #endif
 }
 
-static void __init arm_bootmem_free(unsigned long min, unsigned long max_low,
+static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
unsigned long max_high)
 {
unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
@@ -396,8 +344,6 @@ void __init bootmem_init(void)
 
find_limits(&min, &max_low, &max_high);
 
-   arm_bootmem_init(min, max_low);
-
/*
 * Sparsemem tries to allocate bootmem in memory_present(),
 * so must be done after the fixed reservations
@@ -414,7 +360,7 @@

[RFC/RFT PATCH 2/5] mm: dma-mapping: Add dma_max_pfn(dev) helper function

2013-07-12 Thread Santosh Shilimkar

Most of the kernel assumes that PFN0 is the start of the physical
memory (RAM). This assumptions is not true on most of the ARM SOCs
and hence and if one try to update the ARM port to follow the assumptions,
we end of breaking the dma bounce limit for few block layer drivers.
One such example is trying to unify the meaning of max*_pfn on ARM
as the bootmem layer expects, breaks few block layer driver dma
bounce limit.

To fix this problem, we introduce dma_max_pfn(dev) generic helper with
a possibility of override from the architecture code. The helper converts
a DMA bitmask of bits to a block PFN number. In all the generic cases,
it is just  "dev->dma_mask >> PAGE_SHIFT" and hence default behavior
is maintained as is.

Subsequent patches will make use of the helper. No functional change.

Cc: Russell King 
Cc: Jens Axboe 

Signed-off-by: Santosh Shilimkar 
---
 include/linux/dma-mapping.h |7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 94af418..68a7863 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -129,6 +129,13 @@ static inline int dma_set_seg_boundary(struct device *dev, 
unsigned long mask)
return -EIO;
 }
 
+#ifndef dma_max_pfn
+static inline unsigned long dma_max_pfn(struct device *dev)
+{
+   return *dev->dma_mask >> PAGE_SHIFT;
+}
+#endif
+
 static inline void *dma_zalloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t flag)
 {
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC/RFT PATCH 0/5] mm: ARM nobootmem and few dma_mask fixes

2013-07-12 Thread Santosh Shilimkar

The series is an attempt to move ARM port to NO_BOOTMEM. As discussed
on list NO_BOOTMEM move needed updates to max*pfn meaning to be maximum
PFNs but that breaks the dma_mask for few block layer drivers since
ARM start of physical memory is not PFN0 unlike most of the architectures.
Some more read on it is here:
http://lwn.net/Articles/543408/
http://lwn.net/Articles/543424/

To address this issue, we introduce generic dma_max_pfn() helper which
can be overridden from the architectures.

Another intention behind move to nobootmem is also to convert ARM to
switch to memblock and getting rid of bootmem allocator dependency which
don't work for LPAE machines which has physical memory starting beyond
4 GB boundary. It needs changes to core kernel and also a new memblock
API. More on this can be found here:
https://lkml.org/lkml/2013/6/29/77

I have been trying to cook up these patches with kind help from Russell
and we know series don't solve all the dma_mask bad assumptions. But at
least I am hoping that it can get the ball rolling. 

Comments/testing help is welcome !!

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 
Cc: Tejun Heo 
Cc: Jens Axboe 

Santosh Shilimkar (5):
  block: Rename parameter dma_mask to max_addr for
blk_queue_bounce_limit()
  mm: dma-mapping: Add dma_max_pfn(dev) helper function
  scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: mm: change max*pfn to include the physical offset of memory
  ARM: mm: Remove bootmem code and switch to NO_BOOTMEM

 arch/arm/Kconfig   |1 +
 arch/arm/include/asm/dma-mapping.h |   16 ++---
 arch/arm/kernel/setup.c|2 +-
 arch/arm/mm/init.c |   68 
 block/blk-settings.c   |8 ++---
 drivers/scsi/scsi_lib.c|2 +-
 include/linux/dma-mapping.h|7 
 7 files changed, 32 insertions(+), 72 deletions(-)

-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC/RFT PATCH 4/5] ARM: mm: change max*pfn to include the physical offset of memory

2013-07-12 Thread Santosh Shilimkar

Most of the kernel code assumes that max*pfn is maximum pfns because
the physical start of memory is expected to be PFN0. Since this
assumption is not true on ARM architectures, the meaning of max*pfn
is number of memory pages. This is done to keep drivers happy which
are making use of of these variable to calculate the dma bounce limit
using dma_mask.

Now since we have a architecture override possibility for DMAable
maximum pfns, lets make meaning of max*pfns as maximum pnfs on ARM
as well.

In the patch, the dma_to_pfn/pfn_to_dma() pair is hacked to take care of
the physical memory offset. It is done this way just to enable testing
since its understood that it can come in way of single zImage.

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 

Signed-off-by: Santosh Shilimkar 
---
 arch/arm/include/asm/dma-mapping.h |   16 
 arch/arm/mm/init.c |   10 --
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..b2d5937 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -47,12 +47,12 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   return (dma_addr_t)__pfn_to_bus(pfn);
+   return (dma_addr_t)__pfn_to_bus(pfn + PHYS_PFN_OFFSET);
 }
 
 static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
-   return __bus_to_pfn(addr);
+   return __bus_to_pfn(addr) - PHYS_PFN_OFFSET;
 }
 
 static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
@@ -64,15 +64,16 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
 {
return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
 }
+
 #else
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   return __arch_pfn_to_dma(dev, pfn);
+   return __arch_pfn_to_dma(dev, pfn + PHYS_PFN_OFFSET);
 }
 
 static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
-   return __arch_dma_to_pfn(dev, addr);
+   return __arch_dma_to_pfn(dev, addr) - PHYS_PFN_OFFSET;
 }
 
 static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
@@ -86,6 +87,13 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
 }
 #endif
 
+/* The ARM override for dma_max_pfn() */
+static inline unsigned long dma_max_pfn(struct device *dev)
+{
+   return dma_to_pfn(dev, *dev->dma_mask);
+}
+#define dma_max_pfn(dev) dma_max_pfn(dev)
+
 /*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 6833cbe..588a2c1 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -420,12 +420,10 @@ void __init bootmem_init(void)
 * This doesn't seem to be used by the Linux memory manager any
 * more, but is used by ll_rw_block.  If we can get rid of it, we
 * also get rid of some of the stuff above as well.
-*
-* Note: max_low_pfn and max_pfn reflect the number of _pages_ in
-* the system, not the maximum PFN.
 */
-   max_low_pfn = max_low - PHYS_PFN_OFFSET;
-   max_pfn = max_high - PHYS_PFN_OFFSET;
+   min_low_pfn = min;
+   max_low_pfn = max_low;
+   max_pfn = max_high;
 }
 
 /*
@@ -531,7 +529,7 @@ static inline void free_area_high(unsigned long pfn, 
unsigned long end)
 static void __init free_highpages(void)
 {
 #ifdef CONFIG_HIGHMEM
-   unsigned long max_low = max_low_pfn + PHYS_PFN_OFFSET;
+   unsigned long max_low = max_low_pfn;
struct memblock_region *mem, *res;
 
/* set highmem page free */
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[RFC/RFT PATCH 1/5] block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()

2013-07-12 Thread Santosh Shilimkar

The blk_queue_bounce_limit() API parameter 'dma_mask' is actually the
maximum address the device can handle rather than a dma_mask. Rename
it accordingly to avoid it being interpreted as dma_mask.

No functional change.

The idea is to fix the bad assumptions about dma_mask wherever it could
be miss-interpreted.

Cc: Russell King 
Cc: Jens Axboe 

Signed-off-by: Santosh Shilimkar 
---
 block/blk-settings.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index c50ecf0..026c151 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -195,17 +195,17 @@ EXPORT_SYMBOL(blk_queue_make_request);
 /**
  * blk_queue_bounce_limit - set bounce buffer limit for queue
  * @q: the request queue for the device
- * @dma_mask: the maximum address the device can handle
+ * @max_addr: the maximum address the device can handle
  *
  * Description:
  *Different hardware can have different requirements as to what pages
  *it can do I/O directly to. A low level driver can call
  *blk_queue_bounce_limit to have lower memory pages allocated as bounce
- *buffers for doing I/O to pages residing above @dma_mask.
+ *buffers for doing I/O to pages residing above @max_addr.
  **/
-void blk_queue_bounce_limit(struct request_queue *q, u64 dma_mask)
+void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr)
 {
-   unsigned long b_pfn = dma_mask >> PAGE_SHIFT;
+   unsigned long b_pfn = max_addr >> PAGE_SHIFT;
int dma = 0;
 
q->bounce_gfp = GFP_NOIO;
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver

2013-09-18 Thread Santosh Shilimkar

Thomas,

On Friday 13 September 2013 10:55 AM, Santosh Shilimkar wrote:
> On Friday 13 September 2013 10:24 AM, Thomas Gleixner wrote:

[...]

>> Before you dig into MSI, lets talk about irq domains first.
>>
>> GIC implements a legacy irq domain, i.e. a linear domain of all
>> possible GIC interrupts with a 1:1 mapping.
>>
>> So why can't you make use of irq domains and have the whole routing
>> business implemented sanely?
>>
>> What's needed is in gic_init_bases():
>>
>>if (of_property_read(node, "routable_irqs", &nr_routable_irqs) {
>>irq_domain_add_legacy(nr_gic_irqs);
>>} else {
>>irq_domain_add_legacy(nr_per_cpu_irqs);
>>irq_domain_add_linear(nr_routable_irqs);
>>}
>>
>> Now that separate domain has an xlate function which grabs a free GIC
>> irq from a bitmap and returns the hardware irq number in the gic
>> space. The map/unmap callbacks take care of setting up / tearing down
>> the route in the crossbar.
>>
>> Thoughts?
>>
> This sounds pretty good idea. We will explore above option.
> Thanks Thomas.
> 
After further looking into this, the irqdomain approach lets us
setup the map only once during the init. This is similar to
the earlier approach of cross-bar driver where at probe time
the router was setup.  The whole debate started with the fact
that we shouldn't fix the irq mapping at probe and should
dynamically change the mapping based on [request/free]_irq()
to be able to maximize the use of the IP.

Since we have agreed now to move ahead with irdomain, i thought
of mentioning it here.

Regards,
Santosh







--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver

2013-09-12 Thread Santosh Shilimkar

Thomas,

On Thursday 12 September 2013 04:18 PM, Thomas Gleixner wrote:
> On Thu, 12 Sep 2013, Sricharan R wrote:
>> Signed-off-by: Sricharan R 
>> ---
>> There is lockdep warning during the boot. This is because we try to
>> do one request_irq with in another and that results in kmalloc being
>> called from an atomic context, which generates the warning.
>> Any suggestions to overcome this will help.
> 
> You can't be serious about that. You post a patch series with a
> serious bug in it instead of asking in the first place how to solve
> the problem at hand.
> 
> So why do you actually need to call request_irq() from inside
> request_irq() and how do you actually manage to do that at all?
> 
> I have a hard time to figure out how request_irq() might call itself
> recursive. And I have no intention to look at your patch to figure out
> that you abuse an irq chip callback to do so, simply because that
> would force me to use abusive language which is frowned upon nowadays.
> 
This is an outcome of the discussion on earlier patch set [1] which
tried to add IRQ event router functionality. From beginning I was
against the idea because I also felt that we are abusing the irqchip
infrastructure and force fitting the cross-bar to be behave like an
irqchip. But Linus W and few others strongly felt it to make it
an irqchip implementation.

> Can you please explain what you are trying to solve without referring
> to your existing broken implementation.
> 
I will try to summaries the IP and its need here. On TI SOCs, we have
an IP cross-bar which is essentially an even router(hardware). It can
route the IRQ and DMA in existing implementation.

Specifically for the IRQ case addressed here, the cross-bar IP
sits between the interrupt controller and peripheral interrupts.

CPU <-- GIC  <- CROSSBAR <- PERIPHERAL IRQs

Just to expand it better, cross-bar input IRQ lines are more than
what a GIC IRQ controller can support.
e.q Total 250 peripheral IRQ lines out of which GIC support
only 160 IRQ lines.

So the idea here is to dynamically map the IRQ lines at
cross-bar level to pick based on request_irq() so that one
can optimize the use of limited IRQ lines at the GIC level.
Strictly speaking the need is just establish the IRQ
connection from peripheral to GIC and thats achieved
at the request_irq() level.

Earlier approach was to statically build this connections
using the DT information in a separate driver probe but
it had limitations of fixing the IRQ map and taking away
flexibility what this IP provide. 

Hope this gives better picture to you behind the patch
series.

Just for others to know, use of IRQCHIP also forced to
have all the irqchip handler redirection which is pure
redirection including the irq-handler. And since it
is *fast path* I asked Sricharan to measure the latency
which is around ~2 uS( 1GHz CPU) overhead for every
interrupt just because of redirections. 

Regards,
Santosh

[1] https://lkml.org/lkml/2013/7/18/362

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver

2013-09-12 Thread Santosh Shilimkar

On Thursday 12 September 2013 06:22 PM, Thomas Gleixner wrote:
> On Thu, 12 Sep 2013, Santosh Shilimkar wrote:
>> Specifically for the IRQ case addressed here, the cross-bar IP
>> sits between the interrupt controller and peripheral interrupts.
>>
>> CPU <-- GIC  <- CROSSBAR <- PERIPHERAL IRQs
>>
>> Just to expand it better, cross-bar input IRQ lines are more than
>> what a GIC IRQ controller can support.
>> e.q Total 250 peripheral IRQ lines out of which GIC support
>> only 160 IRQ lines.
>>
>> So the idea here is to dynamically map the IRQ lines at
>> cross-bar level to pick based on request_irq() so that one
>> can optimize the use of limited IRQ lines at the GIC level.
>> Strictly speaking the need is just establish the IRQ
>> connection from peripheral to GIC and thats achieved
>> at the request_irq() level.
>>
>> Earlier approach was to statically build this connections
>> using the DT information in a separate driver probe but
>> it had limitations of fixing the IRQ map and taking away
>> flexibility what this IP provide. 
>>  
>> Hope this gives better picture to you behind the patch
>> series.
> 
> Yes. I halfways understand what you are trying to achieve.
> 
> So CROSSBAR is a routing block between the peripheral and the GIC in
> order to expand the number of possible interrupts.
> 
> Now the real question is, how that expansion mechanism is supposed to
> work. There are two possible scenarios:
> 
> 1) Expand the number of handled interrupts beyond the GIC capacity:
> 
>That requires a mechanism in CROSSBAR to map several CROSSBAR
>interrupts to a particular GIC interrupt and provide a demux
>mechanism to invoke the shared handlers.
> 
This is not possible in hardware and not supported. Hardware has
no notion of muxing multiple IRQ's to generate 1 IRQ or ack etc
functionality. Its a simple MUX to tie knots between input and output
wires.

> 2) Provide a mapping mechanism between possibly 250 interrupt numbers
>and a limitation of a total 160 active interrupts by the underlying
>GIC.
> 
This is the need and problem we are trying to solve.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver

2013-09-12 Thread Santosh Shilimkar

On Thursday 12 September 2013 08:26 PM, Thomas Gleixner wrote:
> On Thu, 12 Sep 2013, Santosh Shilimkar wrote:
>> On Thursday 12 September 2013 06:22 PM, Thomas Gleixner wrote:
>>> Now the real question is, how that expansion mechanism is supposed to
>>> work. There are two possible scenarios:
>>>
>>> 1) Expand the number of handled interrupts beyond the GIC capacity:
>>>
>>>That requires a mechanism in CROSSBAR to map several CROSSBAR
>>>interrupts to a particular GIC interrupt and provide a demux
>>>mechanism to invoke the shared handlers.
>>>
>> This is not possible in hardware and not supported. Hardware has
>> no notion of muxing multiple IRQ's to generate 1 IRQ or ack etc
>> functionality. Its a simple MUX to tie knots between input and output
>> wires.
> 
> It's not a MUX. It's a ROUTING mechanism. That's similar to the
> mechanisms which are used by MSI[X]. We assign arbitrary interrupt
> numbers to a device and route them to some underlying limited hardware
> interrupt controller.
> 
>>> 2) Provide a mapping mechanism between possibly 250 interrupt numbers
>>>and a limitation of a total 160 active interrupts by the underlying
>>>GIC.
>>>
>> This is the need and problem we are trying to solve.
> 
> Let me summarize:
> 
>- GIC supports up to 160 interrupts
> 
>- CROSSBAR supports up to 250 interrupts 
> 
>- CROSSBAR routes up to 160 out of 250 interrupts to the GIC ones
> 
>- Drivers request a CROSSBAR interrupt number which must be mapped
>  to some arbitrary available GIC irq number
> 
Correct.

> So basically the CROSSBAR mechanism is pretty much the same as MSI[X]
> just in a different flavour and with a different set of semantics and
> limitations, i.e. poor mans MSI[X] with a new level of bogosity.
> 
> So if CROSSBAR is going to be the new fangled SoC MSI[X] long term
> equivalent then you better provide some infrastructure for that and
> make the drivers ready to use it. Maybe check with the PCI/MSI folks
> to share some of the interfaces.
>
> If that whole thing is another onetime HW designers wet dream, then
> please go back to the limited but completely functional (Who is going
> to use more than 160 peripheral interrupts) device tree model. I
> really have no interest to support hardware designer brain farts.
> 
Thanks for clear NAK for irqchip approach. I should have looped you
in the discussion where I was also suggesting against the irqchip
approach. We will try to look at MSI stuff but if its get too
complicated am going to fall-back to the initial probe based
approach to achieve the functionality.

Thanks again for clear direction and useful discussion.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver

2013-09-13 Thread Santosh Shilimkar

On Friday 13 September 2013 10:24 AM, Thomas Gleixner wrote:
> On Thu, 12 Sep 2013, Santosh Shilimkar wrote:
>> On Thursday 12 September 2013 08:26 PM, Thomas Gleixner wrote:
>>> Let me summarize:
>>>
>>>- GIC supports up to 160 interrupts
>>>
>>>- CROSSBAR supports up to 250 interrupts 
>>>
>>>- CROSSBAR routes up to 160 out of 250 interrupts to the GIC ones
>>>
>>>- Drivers request a CROSSBAR interrupt number which must be mapped
>>>  to some arbitrary available GIC irq number
>>>
>> Correct.
>>
>>> So basically the CROSSBAR mechanism is pretty much the same as MSI[X]
>>> just in a different flavour and with a different set of semantics and
>>> limitations, i.e. poor mans MSI[X] with a new level of bogosity.
>>>
>>> So if CROSSBAR is going to be the new fangled SoC MSI[X] long term
>>> equivalent then you better provide some infrastructure for that and
>>> make the drivers ready to use it. Maybe check with the PCI/MSI folks
>>> to share some of the interfaces.
>>>
>>> If that whole thing is another onetime HW designers wet dream, then
>>> please go back to the limited but completely functional (Who is going
>>> to use more than 160 peripheral interrupts) device tree model. I
>>> really have no interest to support hardware designer brain farts.
>>>
>> Thanks for clear NAK for irqchip approach. I should have looped you
>> in the discussion where I was also suggesting against the irqchip
>> approach. We will try to look at MSI stuff but if its get too
>> complicated am going to fall-back to the initial probe based
>> approach to achieve the functionality.
> 
> Before you dig into MSI, lets talk about irq domains first.
> 
> GIC implements a legacy irq domain, i.e. a linear domain of all
> possible GIC interrupts with a 1:1 mapping.
> 
> So why can't you make use of irq domains and have the whole routing
> business implemented sanely?
> 
> What's needed is in gic_init_bases():
> 
>if (of_property_read(node, "routable_irqs", &nr_routable_irqs) {
> irq_domain_add_legacy(nr_gic_irqs);
>} else {
> irq_domain_add_legacy(nr_per_cpu_irqs);
> irq_domain_add_linear(nr_routable_irqs);
>}
> 
> Now that separate domain has an xlate function which grabs a free GIC
> irq from a bitmap and returns the hardware irq number in the gic
> space. The map/unmap callbacks take care of setting up / tearing down
> the route in the crossbar.
> 
> Thoughts?
> 
This sounds pretty good idea. We will explore above option.
Thanks Thomas.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] sched_clock: fix postinit no sched_clock function check

2013-10-09 Thread Santosh Shilimkar

On Wednesday 09 October 2013 07:59 PM, John Stultz wrote:
> On 10/02/2013 11:07 AM, Santosh Shilimkar wrote:
>> On Wednesday 02 October 2013 01:48 PM, Will Deacon wrote:
>>> On Wed, Oct 02, 2013 at 06:42:40PM +0100, Stephen Boyd wrote:
>>>> On 10/02/13 10:27, Santosh Shilimkar wrote:
>>>>> Really... I have not created patch out of fun.
>>>>> Its broken on my keystone machine at least where the sched_clock is
>>>>> falling back on jiffy based sched_clock even in presence of arch_timer
>>>>> sched_clock.
>>>> How is that possible? sched_clock_func is only assigned by
>>>> arch/arm/kernel/arch_timer.c when the architected timer is detected and
>>>> sched_clock() in kernel/time/sched_clock.c calls that function pointer
>>>> unconditionally. The only way I see this happening is if the architected
>>>> timer rate is zero.
>>>   ^^
>>>
>>> *cough* CNTFRQ *cough*
>>>
>> :) CNTFRQ as such is fine. I think the below print mis-lead me mostly. 
>>
>> sched_clock: ARM arch timer >56 bits at 6144kHz, resolution 162ns
>> sched_clock: 32 bits at 100 Hz, resolution 1000ns, wraps every 
>> 4294967286ms
>>
>> So yes, now the subject patch actually just avoids the jiffy sched_clock()
>> registration and nothing else. Even without the patch arch_timer sched_clock
>> will be in use.
> 
> Just wanted to follow up here, as I've not been paying close attention.
> Is this issue then resolved, or is something still needed to be queued
> for 3.12/3.13?
> 
There is no regression as I initially thought. Patch fixes the
miss-leading sched_clock print and also prevents timer to handle 
wrapping which is not needed.

So no big deal and I don't mind if we don't apply it.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v4 0/5] clk: dt: bindings for mux, divider & gate clocks

2013-08-29 Thread Santosh Shilimkar

Mike,

On Thursday 22 August 2013 01:53 AM, Mike Turquette wrote:
> This series introduces binding definitions for common register-mapped
> clock multiplexer, divider and gate IP blocks along with the
> corresponding setup functions for matching DT data.  The bindings are
> similar to the struct definitions but please don't hold that against the
> binding: the struct definitions closely model the hardware register
> layout.
> 
> The delta from v3 is very small and only consists of cosmetic changes
> and one functional change for better error handling.
> 
> Changes since v3:
> * added Tested-by & Reviewed-by from Heiko
> * replaced underscores with dashes in DT property names
> * bail from of clock setup function early if of_iomap fails
> * removed unecessary explict cast
> 
> Mike Turquette (5):
>   clk: divider: replace bitfield width with mask
>   clk: of: helper for determining number of parent clocks
>   clk: dt: binding for basic multiplexer clock
>   clk: dt: binding for basic divider clock
>   clk: dt: binding for basic gate clock
> 
I have used $subject series as the base for the Keystone
ccf support and tested these patches on Keystone EVM. Will
post those patches soon on the lists.

For the series if you need more tested-by tags, feel free to
add,
Tested-by: Santosh Shilimkar 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 1/2] gpio/omap: maintain GPIO and IRQ usage separately

2013-09-27 Thread Santosh Shilimkar

On Tuesday 24 September 2013 08:36 PM, Javier Martinez Canillas wrote:
> The GPIO OMAP controller pins can be used as IRQ and GPIO
> independently so is necessary to keep track GPIO pins and
> IRQ lines usage separately to make sure that the bank will
> always be enabled while being used.
> 
> Also move gpio_is_input() definition in preparation for the
> next patch that setups the controller's irq_chip driver when
> a caller requests an interrupt line.
> 
> Signed-off-by: Javier Martinez Canillas 
> ---
Considering GPIO core maintainer is fine with this approach,
am fine with both of your patches. Again thanks a lot for
fixing the long nagging issue.

FWIW,
Acked-by: Santosh Shilimkar 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP

2013-09-30 Thread Santosh Shilimkar

On Monday 30 September 2013 09:59 AM, Sricharan R wrote:
> Some socs have a large number of interrupts requests to service
> the needs of its many peripherals and subsystems. All of the interrupt
> requests lines from the subsystems are not needed at the same
> time, so they have to be muxed to the controllers appropriately.
> In such places a interrupt controllers are preceded by an
> IRQ CROSSBAR that provides flexibility in muxing the device interrupt
> requests to the controller inputs.
> 
> This series models the peripheral interrupts that can be routed through
> the crossbar to the GIC as 'routable-irqs'. The routable irqs are added
> in a separate linear domain inside the GIC. The registered routable domain's
> callback are invoked as a part of the GIC's callback, which in turn should
> allocate a free irq line and configure the IP accordingly. So every peripheral
> in the dts files mentions the fixed crossbar number as its interrupt. A free
> gic line for that gets allocated and configured when the peripheral's 
> interrupt
> is mapped.
> 
> The minimal crossbar driver to track and allocate free GIC lines and 
> configure the
> crossbar is added here, along with the DT bindings.
> 
You should have references to the previous discussions so that its
easier for new reviewers to understand why you ended up the approach.
I noticed you missed this in your last posts as well.

Regards,
Santosh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC PATCH 1/6] DRIVERS: IRQCHIP: IRQ-GIC: Add support for routable irqs

2013-09-30 Thread Santosh Shilimkar

On Monday 30 September 2013 10:16 AM, Marc Zyngier wrote:
> On 30/09/13 14:59, Sricharan R wrote:
>> In some socs the gic can be preceded by a crossbar IP which
>> routes the peripheral interrupts to the gic inputs. The peripheral
>> interrupts are associated with a fixed crossbar input line and the
>> crossbar routes that to one of the free gic input line.
>>
>> The DT entries for peripherals provides the fixed crossbar input line
>> as its interrupt number and the mapping code should associate this with
>> a free gic input line. This patch adds the support inside the gic irqchip
>> to handle such routable irqs. The routable irqs are registered in a linear
>> domain. The registered routable domain's callback should be implemented
>> to get a free irq and to configure the IP to route it.
> 
> Isn't this just another chained interrupt controller? How is it GIC
> specific?
> 
No it isn't a irq controller rather a event router. Patch is missing
reference to the previous discussion. Previous discussion is here [1]

Regards,
Santosh

[1] https://lkml.org/lkml/2013/9/13/413

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC] gpio/omap: auto-setup a GPIO when used as an IRQ

2013-09-23 Thread Santosh Shilimkar

Javier,

On Monday 23 September 2013 01:07 PM, Tony Lindgren wrote:
> * Javier Martinez Canillas  [130923 10:09]:
>> On 09/23/2013 06:45 PM, Tony Lindgren wrote:
>>>
>>> Hmm does this still work for legacy platform data based
>>> drivers that are doing gpio_request() first?
>>>
>>
>> Yes it still work when booting using board files. I tested on my OMAP3 board 
>> and
>> it worked in both DT and legacy booting mode.
> 
> OK great.
>  
>>> And what's the path for clearing things for PM when free_irq()
>>> gets called? It seems that this would leave the GPIO bank
>>> enabled causing a PM regression?
>>>
>>
>> Indeed, I did set bank->mod_usage |= 1 << offset so the bank is enabled if 
>> the
>> device goes to suspended and then resumed but I completely forget about the
>> clearing path when the IRQ is freed.
>>
>> Which makes me think that we should probably maintain two usage variables, 
>> one
>> for GPIO and another one for IRQ and check both of them on the 
>> suspend/resume pm
>> functions.
> 
> Yes that it seems that they should be treated separately.
> 
As discussed on IRC, the patch as such is fine after the mentioned fixup,
I would like to hear back if Linus W/Grant is fine with the approach. Not sure
if I missed the discussion, but the proposed patch is deviation from
traditional method of doing gpio_request() first up to perform other
gpio operations.

Regards,
Santosh


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [RFC v2] gpio/omap: auto-setup a GPIO when used as an IRQ

2013-09-24 Thread Santosh Shilimkar

On Tuesday 24 September 2013 11:45 AM, Balaji T K wrote:
> On Tuesday 24 September 2013 09:10 PM, Tony Lindgren wrote:
>> * Javier Martinez Canillas  [130924 01:06]:
>>> The OMAP GPIO controller HW requires a pin to be configured in GPIO
>>> input mode in order to operate as an interrupt input. Since drivers
>>> should not be aware of whether an interrupt pin is also a GPIO or not,
>>> the HW should be fully configured/enabled as an IRQ if a driver solely
>>> uses IRQ APIs such as request_irq(), and never calls any GPIO-related
>>> APIs. As such, add the missing HW setup to the OMAP GPIO controller's
>>> irq_chip driver.
>>>
>>> Since this bypasses the GPIO subsystem we have to ensure that another
>>> caller won't be able to request the same GPIO pin that is used as an
>>> IRQ and set its direction as output. Requesting the GPIO and setting
>>> its direction as input is allowed though.
>>
>> Also please mention the regression that this fixes. So far we know
>> that smsc911x for tobi and igep boards in mainline, and also the
> 
>> MMC card detect for omap4 boards.
> Hi Tony,
> 
> Card detect on omap4 board (sdp and panda) is not based on omap gpio,
> so I think this fix is not applicable for omap4.
> 
> Card detect line for SD card goes to power IC on OMAP4 panda and SDP.
> 
I confused Tony mostly. It was OMAP4 SPI based ethernet which uses the
GPIO as an interrupt line.

So for Panda, its Ethernet driver and not MMC.

Regards,
Santosh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1304 matches

Mail list logo