Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'
On Friday 15 March 2013 10:30 AM, Will Deacon wrote: > On Thu, Mar 14, 2013 at 01:08:00PM +0530, Santosh Shilimkar wrote: >> Will, > > Hi guys, > > I'm out of the office at the moment and have really terrible connectivity, > so I can't do too much until next week. However, I don't think adding the > has_ossr check is the right fix for this problem. > >> On Wednesday 13 March 2013 05:59 PM, Lokesh Vutla wrote: >>> Hi Dietmar, >>> On Wednesday 13 March 2013 05:35 PM, Dietmar Eggemann wrote: >>>> On 13/03/13 06:52, Lokesh Vutla wrote: >>>>> Commit {9a6eb31 ARM: hw_breakpoint: Debug powerdown support for >>>>> self-hosted >>>>> debug} introduces debug powerdown support for self-hosted debug. >>>>> While merging the patch 'has_ossr' check was removed which >>>>> was needed for hardwares which doesn't support self-hosted debug. >>>>> Pandaboard (A9) is one such hardware and Dietmar's orginial >>>>> patch did mention this issue. >>>>> Without that check on Panda with CPUIDLE enabled, a flood of >>>>> below messages thrown. >>>>> >>>>> [ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch >>>>> [ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch > > Ok, so this means that we've taken an undefined instruction exception while > trying to reset the debug registers on the PM_EXIT path. Now, the code there > deals with CPUs that don't have the save/restore registers just fine, so > that shouldn't have anything to do with this problem, particularly if the > bit that is tripping us up is related to clearing vector catch. > Agree. > Furthermore, I was under the impression that hw_breakpoint did actually > work on panda, which implies that a cold boot *does* manage to reset the > registers (can you please confirm this by looking in your dmesg during > boot?). In that case, it seems as though a PM cycle is powering down a > bunch of debug logic that was powered up during boot, and then we trip over > because we can't access the register bank. > Actually it seems to be without PM. Thanks to analysis from Lokesh, the issue can be seen even with just suspend or cpu hotplug. So cold boot as such is fine. > The proper solution to this problem requires us to establish exactly what is > turning off the debug registers, and then having an OMAP PM notifier to > enable it again. Assuming this has always been the case, I expect hardware > debug across PM fails silently with older kernels. > This has been always the case it seems with CPU power cycle. After the CPU is power cycled, 'DBGAUTHSTATUS' reads '0xaa' rather than '0xaf' which means 'DBGEN = 0' and hence code fails to enable monitor mode. This happens on both secure and GP devices and it can not be patched since the secure code is ROM'ed. We didn't notice so far because hw_breakpoint support was not default enabled on OMAP till the multi-platform build. >> I was also wondering whether we should just warn once rather >> than continuous warnings in the notifier. Patch is end of the >> email. > > Could do, but I'd like to see a fix for the real issue before we simply hide > the warnings :) > Agree here too. As evident above, the feature won't work on OMAP4 devices with PM and hence some solution is needed. What you think of below ? >From d74b4264f6a5967b0f7ada96aad77ab0ac30dbed Mon Sep 17 00:00:00 2001 From: Santosh Shilimkar Date: Mon, 18 Mar 2013 11:59:04 +0530 Subject: [PATCH] ARM: hw_breakpoints: Check for CPU debug availability before enabling it CPU debug features like hardware break, watchpoints can be used only when the debug mode is enabled and available for non-secure mode. Hence check 'DBGAUTHSTATUS.DBGEN' before proceeding to enable the features. Thanks to Will for pointers and Lokesh for the analysis of the issue on OMAP4 where after a CPU power cycle, debug mode gets disabled. Cc: Will Deacon Tested-by: Lokesh Vutla Signed-off-by: Santosh Shilimkar --- arch/arm/kernel/hw_breakpoint.c |8 1 file changed, 8 insertions(+) diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index 96093b7..683a7cf 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -930,6 +930,14 @@ static void reset_ctrl_regs(void *unused) int i, raw_num_brps, err = 0, cpu = smp_processor_id(); u32 val; + /* Check if we have access to CPU debug features */ + ARM_DBG_READ(c7, c14, 6, val); + if ((val & 0x1) == 0) { + pr_warn_once("CPU %d debug is unavailable\n"
Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'
On Monday 18 March 2013 08:37 PM, Will Deacon wrote: > Hi Santosh, > > On Mon, Mar 18, 2013 at 06:51:30AM +, Santosh Shilimkar wrote: >> On Friday 15 March 2013 10:30 AM, Will Deacon wrote: >>> Furthermore, I was under the impression that hw_breakpoint did actually >>> work on panda, which implies that a cold boot *does* manage to reset the >>> registers (can you please confirm this by looking in your dmesg during >>> boot?). In that case, it seems as though a PM cycle is powering down a >>> bunch of debug logic that was powered up during boot, and then we trip over >>> because we can't access the register bank. >>> >> Actually it seems to be without PM. Thanks to analysis from Lokesh, the issue >> can be seen even with just suspend or cpu hotplug. So cold boot as such is >> fine. > > Right, so what you're saying is that monitor-mode hardware debug only works > until the first pm/suspend/hotplug operation, then it's dead in the water? > >>> The proper solution to this problem requires us to establish exactly what is >>> turning off the debug registers, and then having an OMAP PM notifier to >>> enable it again. Assuming this has always been the case, I expect hardware >>> debug across PM fails silently with older kernels. >>> >> This has been always the case it seems with CPU power cycle. >> After the CPU is power cycled, 'DBGAUTHSTATUS' reads '0xaa' rather >> than '0xaf' which means 'DBGEN = 0' and hence code fails to enable >> monitor mode. This happens on both secure and GP devices and it can not >> be patched since the secure code is ROM'ed. We didn't notice so far >> because hw_breakpoint support was not default enabled on OMAP till the >> multi-platform build. > > That really sucks :( Does this affect all OMAP-based boards? > All OMAP4 based boards.. >>>> I was also wondering whether we should just warn once rather >>>> than continuous warnings in the notifier. Patch is end of the >>>> email. >>> >>> Could do, but I'd like to see a fix for the real issue before we simply hide >>> the warnings :) >>> >> Agree here too. As evident above, the feature won't work on OMAP4 >> devices with PM and hence some solution is needed. >> >> What you think of below ? > > Comments inline... > >> >> From d74b4264f6a5967b0f7ada96aad77ab0ac30dbed Mon Sep 17 00:00:00 2001 >> From: Santosh Shilimkar >> Date: Mon, 18 Mar 2013 11:59:04 +0530 >> Subject: [PATCH] ARM: hw_breakpoints: Check for CPU debug availability before >> enabling it >> >> CPU debug features like hardware break, watchpoints can be used only when >> the debug mode is enabled and available for non-secure mode. >> >> Hence check 'DBGAUTHSTATUS.DBGEN' before proceeding to enable the >> features. >> >> Thanks to Will for pointers and Lokesh for the analysis of the issue on >> OMAP4 where after a CPU power cycle, debug mode gets disabled. >> >> Cc: Will Deacon >> >> Tested-by: Lokesh Vutla >> Signed-off-by: Santosh Shilimkar >> --- >> arch/arm/kernel/hw_breakpoint.c |8 >> 1 file changed, 8 insertions(+) >> >> diff --git a/arch/arm/kernel/hw_breakpoint.c >> b/arch/arm/kernel/hw_breakpoint.c >> index 96093b7..683a7cf 100644 >> --- a/arch/arm/kernel/hw_breakpoint.c >> +++ b/arch/arm/kernel/hw_breakpoint.c >> @@ -930,6 +930,14 @@ static void reset_ctrl_regs(void *unused) >> int i, raw_num_brps, err = 0, cpu = smp_processor_id(); >> u32 val; >> >> +/* Check if we have access to CPU debug features */ >> +ARM_DBG_READ(c7, c14, 6, val); >> +if ((val & 0x1) == 0) { >> +pr_warn_once("CPU %d debug is unavailable\n", cpu); >> +cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu)); >> +return; >> +} > > There are a few of problems here: > > 1. That is only checking for non-secure access, which precludes > running Linux in secure mode. > We can check bit 4 as well to take care linux running in secure mode. > 2. DBGAUTHSTATUS accesses are UNPREDICTABLE when the double-lock is > set for v7.1 processors. > > 3. DBGAUTHSTATUS doesn't exist in ARMv6. > We cans skip the code for these versions like... if (!ARM_DEBUG_ARCH_V7_ECP14 == get_debug_arch()) goto skip_dbgsts_read; > 4. CPUs without the security extensions ha
Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'
On Monday 18 March 2013 10:36 PM, Will Deacon wrote: > On Mon, Mar 18, 2013 at 03:46:28PM +0000, Santosh Shilimkar wrote: >> On Monday 18 March 2013 08:37 PM, Will Deacon wrote: >>> That really sucks :( Does this affect all OMAP-based boards? >>> >> All OMAP4 based boards.. > > Brilliant. Is there any way that the secure code can be fixed in future > products? > Nope. This can only be done with new silicon rev for GP devices which is not going to happen. For secure devices, some secure patching is possible but these are not development devices, so not much point. >>>> + /* Check if we have access to CPU debug features */ >>>> + ARM_DBG_READ(c7, c14, 6, val); >>>> + if ((val & 0x1) == 0) { >>>> + pr_warn_once("CPU %d debug is unavailable\n", cpu); >>>> + cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu)); >>>> + return; >>>> + } >>> >>> There are a few of problems here: >>> >>> 1. That is only checking for non-secure access, which precludes >>>running Linux in secure mode. >>> >> We can check bit 4 as well to take care linux running in secure mode. > > It still doesn't help unless we know whether Linux is running secure or > non-secure. > ok. >>> 2. DBGAUTHSTATUS accesses are UNPREDICTABLE when the double-lock is >>>set for v7.1 processors. >>> >>> 3. DBGAUTHSTATUS doesn't exist in ARMv6. >>> >> We cans skip the code for these versions like... >> if (!ARM_DEBUG_ARCH_V7_ECP14 == get_debug_arch()) >> goto skip_dbgsts_read; > > Sure, I was just pointing out that the code needs fixing for this. > >>> 4. CPUs without the security extensions have DBGAUTHSTATUS.NSE == 0 >>> >> Which bit is that ? I don't find this bit in Cortex-A9 docs. With all >> these checks, may be a separate function like 'is_debug_available()' >> would be better. > > NSE is bit 0 (the one you're checking). > ok. So the subject patch might break those devices. >> >>> 5. Accessing DBGAUTHSTATUS requires DBGEN to be driven high. >>>Assuming that your pr_warn_once is emitted, this implies that >>>DBGEN is driven high from cold boot, yet the NSE bit is low, >>>implying that DBGEN is also low. That's contradictory, so I have >>>no idea what's going on... >>> >> Me neither. The warning is emitted at least. > > Any chance you could follow up with your firmware/hardware guys about this > please? I'd really like to understand how we end up in this state in case we > can do something in the architecture to stop it from happening in future. > I will check on this part and update you when I hear from them. >>> Apart from that, it's fine! >>> >> If you agree, I can update patch accordingly. >> BTW, do you have any better trick to take care of >> above scenraio ? > > Well, we could just add the warn_once prints but that doesn't stop debug > from breaking after the first pm/suspend/hotplug operation. > Probably we should drop the $subject patch approach and use warn_once at least for current rc. Ofcourse it doesn't help to get working hw_breakpoint support. Patch end of the email with warn_once. Regards, Santosh >From 6611d48eb5571e3e094c7a9c2479e652b37d35e3 Mon Sep 17 00:00:00 2001 From: Santosh Shilimkar Date: Tue, 19 Mar 2013 11:53:41 +0530 Subject: [PATCH] ARM: hw_breakpoints: Use warn_once to avoid debug message flood from reset_ctrl_regs() CPU debug features like hardware break, watchpoints can be used only when the debug mode is enabled and available. Unfortunately on OMAP4 based device, after a CPU power cycle, the debug feature gets disabled which leads to a flood of messages coming from reset_ctrl_regs() which gets called on every CPU_PM_EXIT with CPUidle enabled. So make use of warn_once() so that system is usable. Thanks to Will for pointers and Lokesh for the analysis of the issue. Cc: Will Deacon Tested-by: Lokesh Vutla Signed-off-by: Santosh Shilimkar --- arch/arm/kernel/hw_breakpoint.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index 96093b7..5dc1aa6 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -966,7 +966,7 @@ static void reset_ctrl_regs(void *unused) } if (err) { - pr_warning("CPU %d debug is powered down!\n", cpu); + pr_warn_once("CPU %d debug is pow
Re: [PATCH 0/2] omap serial fix.
On Monday 18 March 2013 04:24 PM, Sourav Poddar wrote: > The first patch is a rearrangement of a macro "OMAP_MAX_HSUART_PORTS" > to a header file so that it can be used in other file. > The second patch fixes the wakeup from uart issue while using > "no_console_suspend" in the > bootargs. > > These patches are developed on 3.8 custom kernel containing omap5 > supend/resume support. > > Cc: Santosh Shilimkar Thanks Sourav for sorting out the issue. With update of changelog suggested by Kevin on patch 1, Feel free to add, Acked-Tested-by: Santosh Shilimkar > Sourav Poddar (2): > driver: serial-omap: move max uart count into generic header file. > arm: mach-omap2: prevent UART console idle on suspend while using > "no_console_suspend" > > arch/arm/mach-omap2/omap_device.c | 36 > - > drivers/tty/serial/omap-serial.c |2 - > include/linux/platform_data/serial-omap.h |1 + > 3 files changed, 36 insertions(+), 3 deletions(-) > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 03/10] gpio: gpio-omap.c: fix checkpatch error
On Wednesday 20 March 2013 05:45 PM, Laurent Navet wrote: > Fix : > gpio/gpio-omap.c:697: ERROR: space required before the open parenthesis '(' > > Signed-off-by: Laurent Navet > --- Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 03/10] gpio: gpio-omap.c: fix checkpatch error
On Wednesday 20 March 2013 05:54 PM, Santosh Shilimkar wrote: > On Wednesday 20 March 2013 05:45 PM, Laurent Navet wrote: >> Fix : >> gpio/gpio-omap.c:697: ERROR: space required before the open parenthesis '(' >> >> Signed-off-by: Laurent Navet >> --- > Acked-by: Santosh Shilimkar > Minor suggestion. You might want to use $subject like "coding style fixes". No strong opinion though. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHv2 0/2] add omap mcspi device tree data
On Monday 04 February 2013 02:21 PM, Sourav Poddar wrote: Patch series adds omap5 evm mcspi nodes and pinctrl data in omap5.dtsi and omap5-evm.dts files. Felipe Balbi (1): arm: dts: omap5: add SPI devices to OMAP5 DeviceTree file Sourav Poddar (1): arm: dts: omap5-evm: Add mcspi data arch/arm/boot/dts/omap5-evm.dts | 46 +++ arch/arm/boot/dts/omap5.dtsi| 40 + 2 files changed, 86 insertions(+), 0 deletions(-) Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 1/3] ARM: vmregion: remove vmregion code entirely
On Tuesday 05 February 2013 06:01 AM, Joonsoo Kim wrote: Now, there is no user for vmregion. So remove it. Acked-by: Nicolas Pitre Signed-off-by: Joonsoo Kim diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index 8a9c4cb..4e333fa 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -6,7 +6,7 @@ obj-y := dma-mapping.o extable.o fault.o init.o \ iomap.o obj-$(CONFIG_MMU) += fault-armv.o flush.o idmap.o ioremap.o \ - mmap.o pgd.o mmu.o vmregion.o + mmap.o pgd.o mmu.o ifneq ($(CONFIG_MMU),y) obj-y += nommu.o diff --git a/arch/arm/mm/vmregion.c b/arch/arm/mm/vmregion.c deleted file mode 100644 index a631016..000 --- a/arch/arm/mm/vmregion.c +++ /dev/null @@ -1,205 +0,0 @@ You might want to use 'git format-patch -D' which will just generate one line for a deleted file. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 0/3] introduce static_vm for ARM-specific static mapped area
On Tuesday 05 February 2013 06:01 AM, Joonsoo Kim wrote: In current implementation, we used ARM-specific flag, that is, VM_ARM_STATIC_MAPPING, for distinguishing ARM specific static mapped area. The purpose of static mapped area is to re-use static mapped area when entire physical address range of the ioremap request can be covered by this area. This implementation causes needless overhead for some cases. For example, assume that there is only one static mapped area and vmlist has 300 areas. Every time we call ioremap, we check 300 areas for deciding whether it is matched or not. Moreover, even if there is no static mapped area and vmlist has 300 areas, every time we call ioremap, we check 300 areas in now. If we construct a extra list for static mapped area, we can eliminate above mentioned overhead. With a extra list, if there is one static mapped area, we just check only one area and proceed next operation quickly. In fact, it is not a critical problem, because ioremap is not frequently used. But reducing overhead is better idea. Another reason for doing this work is for removing vm_struct list management, entirely. For more information, look at the following link. http://lkml.org/lkml/2012/12/6/184 [..] Joonsoo Kim (3): ARM: vmregion: remove vmregion code entirely ARM: ioremap: introduce an infrastructure for static mapped area ARM: mm: use static_vm for managing static mapped areas arch/arm/mm/Makefile |2 +- arch/arm/mm/ioremap.c | 135 +-- arch/arm/mm/mm.h | 12 +++ arch/arm/mm/mmu.c | 34 arch/arm/mm/vmregion.c | 205 arch/arm/mm/vmregion.h | 31 6 files changed, 123 insertions(+), 296 deletions(-) delete mode 100644 arch/arm/mm/vmregion.c delete mode 100644 arch/arm/mm/vmregion.h Nice Clean-up. I tested this series on OMAP which uses few static mappings. Feel free to add, Tested-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 4/6] ARM: dts: omap5: add dwc3 omap dt data
On Tuesday 05 February 2013 02:16 PM, kishon wrote: Hi, On Tuesday 05 February 2013 02:08 PM, Felipe Balbi wrote: On Tue, Feb 05, 2013 at 01:59:26PM +0530, kishon wrote: On Sunday 27 January 2013 01:17 AM, Sergei Shtylyov wrote: Hello. On 25-01-2013 15:11, Kishon Vijay Abraham I wrote: Add dwc3 omap glue data to the omap5 dt data file. The information about the dt node added here is available @ Documentation/devicetree/bindings/usb/omap-usb.txt Signed-off-by: Kishon Vijay Abraham I --- arch/arm/boot/dts/omap5.dtsi | 11 +++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/boot/dts/omap5.dtsi b/arch/arm/boot/dts/omap5.dtsi index 5f59bf2..1703a72 100644 --- a/arch/arm/boot/dts/omap5.dtsi +++ b/arch/arm/boot/dts/omap5.dtsi @@ -513,6 +513,17 @@ ti,type = <2>; }; +omap_dwc3@4a02 { +compatible = "ti,dwc3"; +ti,hwmods = "usb_otg_ss"; +reg = <0x4a02 0x1ff>; Shoudn't the "reg" length be 0x200 here? It's length, not limit. I think 0x1ff is correct. I got the data from hwmod data. hwmod is utterly wrong. Looking at TRM, it says the size here is 64KiB (0x1), so is the size for dwc3 itself. Please don't blindly trust hwmod, make sure you read data from TRM ;-) hmm..ok. But it has only 17 registers :-D As Felipe said, it should be 0x200. And if you are interested in lesser space, there is no need to map entire 64 KB address space if it isn't being used. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHv3 4/4] arm: Add generic timer broadcast support
Mark, On Thursday 07 February 2013 03:34 PM, Mark Rutland wrote: Hi Stephen, Sorry about this; I'm to blame for the bug. On Wed, Feb 06, 2013 at 08:51:43PM +, Stephen Warren wrote: On 01/14/2013 10:05 AM, Mark Rutland wrote: Implement timer_broadcast for the arm architecture, allowing for the use of clock_event_device_drivers decoupled from the timer tick broadcast mechanism. Mark, this patch is now in next-20130206 and causes a crash during boot on Tegra. The reason appears to be because of: diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c @@ -524,7 +524,6 @@ static void __cpuinit percpu_timer_setup(void) struct clock_event_device *evt = &per_cpu(percpu_clockevent, cpu); evt->cpumask = cpumask_of(cpu); - evt->broadcast = smp_timer_broadcast; After that change, evt->broadcast is never assigned, and hence is NULL. Yet elsewhere in kernel/time/tick-broadcast.c it's used unconditionally: static void tick_do_broadcast(struct cpumask *mask) ... if (!cpumask_empty(mask)) { ... td = &per_cpu(tick_cpu_device, cpumask_first(mask)); td->evtdev->broadcast(mask); Now perhaps the Tegra timer driver simply isn't being set up correctly, so the bug is there... But the only other place I can find where ->broadcast is assigned is in tick_device_uses_broadcast() which only does it for "non-functional" timers, which doesn't apply to Tegra's timer. The intent of 12ad100046: "clockevents: Add generic timer broadcast function" was to setup the broadcast function both for non-functional/dummy timers and those that stop in low-power states (CLOCK_EVT_FEAT_C3STOP). I missed the CLOCK_EVT_FEAT_C3STOP case. I am not sure this exactly the case. Because in my testing, the C3STOP path was exercised already. And if the C3STOP is used then notifiers calls are expected for switching of clock-events to broadcast mode. And dummy broad-cast hook should come into picture only if the per CPU local timer clock-event are not registered. I just tried "next-20130207" on OMAP and I see the broad-cast mechanism works and didn't observe any crash. -- Tick Device: mode: 1 Broadcast device Clock Event Device: gp_timer max_delta_ns: 131071523464981 min_delta_ns: 91552 mult: 70369 shift: 31 mode: 3 next_event: 89984375000 nsecs set_next_event: omap2_gp_timer_set_next_event set_mode: omap2_gp_timer_set_mode event_handler: tick_handle_oneshot_broadcast retries:0 tick_broadcast_mask: 0003 tick_broadcast_oneshot_mask: Tick Device: mode: 1 Per CPU device: 0 Clock Event Device: local_timer max_delta_ns: 10737418240 min_delta_ns: 1000 mult: 858993459 shift: 31 mode: 3 next_event: 12525000 nsecs set_next_event: twd_set_next_event set_mode: twd_set_mode event_handler: hrtimer_interrupt retries:346 Tick Device: mode: 1 Per CPU device: 1 Clock Event Device: local_timer max_delta_ns: 10737418240 min_delta_ns: 1000 mult: 858993459 shift: 31 mode: 3 next_event: 89921875000 nsecs set_next_event: twd_set_next_event set_mode: twd_set_mode event_handler: hrtimer_interrupt retries:258 # -- I believe the patch below will fix this for Tegra and any other platforms where broadcast is required in low power states. Am not sure you really need that patch unless and until am missing a scenario in my test. Stephan, Do we have crash log seen on Tegra ? Which tegra CPUIDLE driver is being used when crash is seen ? Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHv3 4/4] arm: Add generic timer broadcast support
On Thursday 07 February 2013 10:21 PM, Stephen Warren wrote: On 02/07/2013 04:40 AM, Santosh Shilimkar wrote: [...] Stephan, Do we have crash log seen on Tegra ? Which tegra CPUIDLE driver is being used when crash is seen ? The crash log is below. It's simply because td->evt->broadcast is NULL because it wasn't ever set. Thanks for the log. Its clear from the log that broadcast event wasn't set. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCHv3 4/4] arm: Add generic timer broadcast support
Mark, On Thursday 07 February 2013 10:47 PM, Mark Rutland wrote: [..] I believe the patch below will fix this for Tegra and any other platforms where broadcast is required in low power states. From the Stephan's crash the issue is pretty clear and your patch make sense. Some how I didn't manage to reproduce the issue on my board and hence assumed everything is just fine. Will have a look at it later why it didn't explode since it was so obvious. Sorry for the noise. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC v2] dmaengine: omap-dma: Start DMA without delay for cyclic channels
On Wednesday 03 April 2013 04:47 PM, Peter Ujfalusi wrote: > cyclic DMA is only used by audio which needs DMA to be started without a > delay. > If the DMA for audio is started using the tasklet we experience random > channel switch (to be more precise: channel shift). > > Reported-by: Peter Meerwald > Signed-off-by: Peter Ujfalusi > --- > Hi Russell, > > Instead of removing the tasklet we can identify the DMA channel used by audio > based on the cyclic flag of the channel. > I think this can be used as a short term solution to fix the audio channel > shift > issue and later when we have the dynamic DMA channel allocation we can adjust > the code. > > Regards, > Peter > > drivers/dma/omap-dma.c | 20 ++-- > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c > index 2ea3d7e..ec3fc4f 100644 > --- a/drivers/dma/omap-dma.c > +++ b/drivers/dma/omap-dma.c > @@ -282,12 +282,20 @@ static void omap_dma_issue_pending(struct dma_chan > *chan) > > spin_lock_irqsave(&c->vc.lock, flags); > if (vchan_issue_pending(&c->vc) && !c->desc) { If you add "!c->cyclic" in above if then you can avoid indentation change and just have else for cyclic case. > - struct omap_dmadev *d = to_omap_dma_dev(chan->device); > - spin_lock(&d->lock); > - if (list_empty(&c->node)) > - list_add_tail(&c->node, &d->pending); > - spin_unlock(&d->lock); > - tasklet_schedule(&d->task); > + /* > + * c->cyclic is used only by audio and in this case the DMA need > + * to be started without delay. > + */ > + if (!c->cyclic) { > + struct omap_dmadev *d = to_omap_dma_dev(chan->device); > + spin_lock(&d->lock); > + if (list_empty(&c->node)) > + list_add_tail(&c->node, &d->pending); > + spin_unlock(&d->lock); > + tasklet_schedule(&d->task); > + } else { > + omap_dma_start_desc(c); > + } > } > spin_unlock_irqrestore(&c->vc.lock, flags); > } > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC v2] dmaengine: omap-dma: Start DMA without delay for cyclic channels
On Wednesday 03 April 2013 05:22 PM, Peter Ujfalusi wrote: > On 04/03/2013 01:24 PM, Santosh Shilimkar wrote: >> On Wednesday 03 April 2013 04:47 PM, Peter Ujfalusi wrote: >>> cyclic DMA is only used by audio which needs DMA to be started without a >>> delay. >>> If the DMA for audio is started using the tasklet we experience random >>> channel switch (to be more precise: channel shift). >>> >>> Reported-by: Peter Meerwald >>> Signed-off-by: Peter Ujfalusi >>> --- >>> Hi Russell, >>> >>> Instead of removing the tasklet we can identify the DMA channel used by >>> audio >>> based on the cyclic flag of the channel. >>> I think this can be used as a short term solution to fix the audio channel >>> shift >>> issue and later when we have the dynamic DMA channel allocation we can >>> adjust >>> the code. >>> >>> Regards, >>> Peter >>> >>> drivers/dma/omap-dma.c | 20 ++-- >>> 1 file changed, 14 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c >>> index 2ea3d7e..ec3fc4f 100644 >>> --- a/drivers/dma/omap-dma.c >>> +++ b/drivers/dma/omap-dma.c >>> @@ -282,12 +282,20 @@ static void omap_dma_issue_pending(struct dma_chan >>> *chan) >>> >>> spin_lock_irqsave(&c->vc.lock, flags); >>> if (vchan_issue_pending(&c->vc) && !c->desc) { >> If you add "!c->cyclic" in above if then you can avoid >> indentation change and just have else for cyclic case. > > It can not be embedded there because of the existing tests. How would we > handle the case when c->desc is _not_ NULL and c->cyclic is false? We would > need to test again in else, but we can not do this for the > vchan_issue_pending(&c->vc). > right. Thanks for clarifying it. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH/Resend 2/2] arm: mach-omap2: prevent UART console idle on suspend while using "no_console_suspend"
On Friday 05 April 2013 12:38 PM, Sourav Poddar wrote: > Hi Kevin, > On Wednesday 03 April 2013 11:18 PM, Kevin Hilman wrote: >> Sourav Poddar writes: >> >>> Hi Kevin, >>> On Wednesday 20 March 2013 05:36 PM, Sourav Poddar wrote: >>>> Realised the list to whom the patch was send got dropped. Ccing >>>> them all.. >>>> On Wednesday 20 March 2013 05:18 PM, Sourav Poddar wrote: >>>>> Hi Kevin, >>>>> On Tuesday 19 March 2013 12:24 AM, Kevin Hilman wrote: >>>>>> Sourav Poddar writes: >>>>>> >>>>>>> With dt boot, uart wakeup after suspend is non functional on >>>>>>> omap4/5 while using >>>>>>> "no_console_suspend" in the bootargs. With "no_console_suspend" >>>>>>> used, od->flags >>>>>>> should be ORed with "OMAP_DEVICE_NO_IDLE_ON_SUSPEND", thereby not >>>>>>> allowing the console >>>>>>> to idle in the suspend path. For non-dt case, this was taken care >>>>>>> by platform data. >>>>>>> >>>>>>> Tested on omap5430evm, omap4430sdp. >>>>>>> >>>>>>> Cc: Santosh Shilimkar >>>>>>> Cc: Felipe Balbi >>>>>>> Cc: Rajendra nayak >>>>>>> Signed-off-by: Sourav Poddar >>>>>> This patch creates a dependency between omap_device (generic, >>>>>> device-independent code) and a specific driver (UART.) >>>>>> >>>>>> If you need to do something like this that's DT boot specific, then >>>>>> we probably need some late initcall in serial.c to handle this. >>>>>> It does >>>>>> not belong in omap_device. >>>>>> >>>>> The following function "omap_device_disable_idle_on_suspend(pdev)" >>>>> should only >>>>> be called once the omap device has been build, which in the case of >>>>> device tree is >>>>> done in omap_device.c file. Moreover, the above call should be >>>>> executed conditionally >>>>> and should depend on the following two parameter. >>>>> >>>>> [1] a. Whether "no_console_suspend" is set and >>>>> b. the device build is a console uart. >>>>> >>>>> When I look closely into the serial.c file, I realised that >>>>> "core_initcall(omap_serial_early_init)" gets called irrespective >>>>> of dt/non dt boot and will take care of most of the stuff(checking >>>>> whether >>>>> "no_console_suspend" is used and which uart is used as a console >>>>> uart) which the >>>>> $subject patch is proposing. >>>>> >>>>> But the problem is that we need to exchange the parsed information >>>>> from serial.c to the omap_device file for the condtional execution of >>>>> "omap_device_disable_idle_on_suspend" >>>>> >>>>> In this case, >>>>> from "serial.c" we need >>>>> 1. no_console_suspend = true >>>>> 2. strcpy(console_name, oh_name), where oh_name corresponds to >>>>> the console uart. >>>>> >>>>> then in "omap_device.c" do >>>>> if (no_console_suspend&& !strcmp(oh->name, console_name)) >>>>> omap_device_disable_idle_on_suspend(pdev); >>>>> >>>>> Please correct if I am understanding it incorrectly. >>>>> >>>>> If the above understanding looks good to you, is there a way we can >>>>> make this >>>>> exchange of information happen between serial.c and omap_device.c file? >>> Any input on this? >>> As I explained earlier, that there is a need to parse information in >>> serial.c and use that in >>> omap_device.c only after the device is build. >> As I explained earlier, any device specific hacks inside omap_device >> should be a red flag that something has gone wrong. >> >> How about fixing the UART driver/core to not runtime suspend if >> no_console_suspend is given? >> >> Then we can get rid of this no_idle_on_suspend hack all together since >> UART is the only remaining user. >> > > Yes, that can be done. > > I cooked up an experimental p
Re: One of these things (CONFIG_HZ) is not like the others..
Jon, On Tuesday 29 January 2013 05:31 AM, John Stultz wrote: On 01/27/2013 10:08 PM, Santosh Shilimkar wrote: On Tuesday 22 January 2013 08:35 PM, Santosh Shilimkar wrote: On Tuesday 22 January 2013 08:21 PM, Russell King - ARM Linux wrote: On Tue, Jan 22, 2013 at 03:44:03PM +0530, Santosh Shilimkar wrote: [..] Thanks for expanding it. It is really helpful. And I think further discussion is pointless until such research has been done (or someone who _really_ knows the time keeping/timer/sched code inside out comments.) Fully agree about experimentation to re-asses the drift. From what I recollect from past, few OMAP customers did report the time drift issue and that is how the switch from 100 --> 128 happened. Anyway I have added the suggested task to my long todo list. So I tried to see if any time drift with HZ = 100 on OMAP. I ran the setup for 62 hours and 27 mins with time synced up once with NTP server. I measure about ~174 millisecond drift which is almost noise considering the observed duration was ~22482 milliseconds. So 174ms drift doesn't sound great, as < 2ms (often much less - though that depends on how close the server is) can be expected with NTP. Although its not clear how you were measuring: Did you see a max 174ms offset while trying to sync with NTP? Was that offset shortly after starting NTP or after NTP converged down? To avoid the server latency, we didn't do continuous sync. The time was synced in the beginning and after 62.5 hours (#ntpd -qg) and the drift of about 174 ms was observed. As you said this could be because of server sync time along with probably some addition from system calls from #ntpd. As mentioned, the other run with HZ = 128 which started 15 hours 20 mins is already showing about 24 mS drift now. I will let it run for couple of more days just to have similar duration run. Regards, santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/2] ARM: kernel: update cpuinfo to print CPU model name
On Wednesday 30 January 2013 04:42 AM, Ruslan Bilovol wrote: Hi, On Tue, Jan 29, 2013 at 6:08 PM, Russell King - ARM Linux wrote: On Tue, Jan 29, 2013 at 05:54:24PM +0200, Ruslan Bilovol wrote: CPU implementer : 0x41 CPU name: OMAP4470 ES1.0 HS Sigh. No. Look at what you're doing - look carefully at the above. "CPU implementer" - 0x41. That's A. For ARM Ltd. ARM Ltd implemented this CPU. Did ARM Ltd really implement OMAP4470 ? I think TI would be very upset if that were to be the case. Yes, it would be very surprisingly :) So no, OMAP4470 is _NOT_ a CPU. It is a SoC. The CPU inside the SoC is a collection of ARM Ltd Cortex A9 _CPUs_. See? Please, learn what a CPU is as opposed to a SoC. Completely agree with you. I will fix this Thank god you agreed to drop your current approach. Please elaborate what you are going to fix and also state what user-space features changes from OMAP4460 to OMAP4470. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH v2 0/2] ARM: update cpuinfo to print SoC model name
On Wednesday 30 January 2013 06:08 AM, Ruslan Bilovol wrote: The following patches update cpuinfo to print SoC model name for ARM. The first patch exactly makes needed changes for ARM architecture and adds a common approach to show SoC name. Second patch uses this approach for OMAP4 SoCs (as live example). Looks like there were few attempts to do similar changes to cpuinfo without any luck (had stuck on review) so this functionality is still not in the kernel yet. In this patch series the update to cpuinfo is very short (10 lines) and easy. Comments are welcome as usual As most of the people already commented, your purpose behind the series isn't clear so please give examples where and why you need device type information from user-space. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] arm: dts: omap4-sdp: Add I2c pinctrl data
On Wednesday 30 January 2013 02:13 PM, Kumar, Anil wrote: Hi Sourav, On Wed, Jan 30, 2013 at 12:10:18, Poddar, Sourav wrote: Hi Luciano, On Wednesday 30 January 2013 11:55 AM, Luciano Coelho wrote: Hi Sourav, On Mon, 2013-01-28 at 16:47 +0530, Sourav Poddar wrote: Booting 3.8-rc4 om omap 4430sdp results in the following error omap_i2c 4807.i2c: did not get pins for i2c error: -19 [1.024261] omap_i2c 4807.i2c: bus 0 rev0.12 at 100 kHz [1.030181] omap_i2c 48072000.i2c: did not get pins for i2c error: -19 [1.037384] omap_i2c 48072000.i2c: bus 1 rev0.12 at 400 kHz [1.043762] omap_i2c 4806.i2c: did not get pins for i2c error: -19 [1.050964] omap_i2c 4806.i2c: bus 2 rev0.12 at 100 kHz [1.056823] omap_i2c 4807a000.i2c: did not get pins for i2c error: -19 [1.064025] omap_i2c 4807a000.i2c: bus 3 rev0.12 at 400 kHz This happens because omap4 dts file is not adapted to use i2c through pinctrl framework. Populating i2c pinctrl data to get rid of the error. Tested on omap4430 sdp with 3.8-rc4 kernel. Signed-off-by: Sourav Poddar Reported-by: Santosh Shilimkar --- Could you do the same thing for panda? I'm getting the same kind of errors with it: omap4 uses pinctrl-single driver for pinmux with DT. Currently pinctrl-single driver is getting up after I2C driver. So I2c cannot use pinctrl. The below patch solve this issue http://www.gossamer-threads.com/lists/linux/kernel/1669067 Can you try with this ? it may solve it. OMAP i2c driver already takes care of -EPROBE_DEFER. The issue as you see from the log is not probe failure but missing the pin information in DT blob. And thats what patch does. Regards santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type
On Friday 01 February 2013 12:10 PM, Viresh Kumar wrote: policy->shared_type field was added only for SoCs with ACPI support: commit 3b2d99429e3386b6e2ac949fc72486509c8bbe36 Author: Venkatesh Pallipadi Date: Wed Dec 14 15:05:00 2005 -0500 P-state software coordination for ACPI core http://bugzilla.kernel.org/show_bug.cgi?id=5737 Many non-ACPI systems are filling this field by mistake, which makes its usage confusing. Lets clean it. Signed-off-by: Viresh Kumar Cc: Linus Walleij Cc: Stephen Warren Cc: Shawn Guo Cc: Santosh Shilimkar --- I haven't looked at the cpufreq code recently but remember that it was needed to ensure that all the CPU which share clock/voltage gets updated (affected cpus) on freq change. The CPUs which needs SW co-ordination, should have this flag enabled and OMAP was falling in that category. May be I miss-understood its use, but can you confirm that SW co-ordination logic continues to work without this flag ? Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type
On Friday 01 February 2013 12:43 PM, Viresh Kumar wrote: On 1 February 2013 12:17, Santosh Shilimkar wrote: I haven't looked at the cpufreq code recently but remember that it was needed to ensure that all the CPU which share clock/voltage gets updated (affected cpus) on freq change. The CPUs which needs SW co-ordination, should have this flag enabled and OMAP was falling in that category. Freq change are done by the target routines of platform cpufreq drivers and they do something like: for_each_cpu(freqs.cpu, policy->cpus) cpufreq_notify_transition(&freqs, CPUFREQ_PRECHANGE); The only requirement from cpufreq core is to keep cpus sharing clock in policy->cpus. I am not talking about just notifiers. This is for external users who has subscribed for notifiers. The point is whether the core CPUFReq gets updated without that flag for all affected CPU. May be I miss-understood its use, but can you confirm that SW co-ordination logic continues to work without this flag ? I believe it should work. It works for the systems i worked on: SPEAr13xx: Dual Cortex A9 ARM TC2: two clusters of A15s and A7s. I will give a try some time next week on OMAP. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type
On Friday 01 February 2013 01:32 PM, Viresh Kumar wrote: On 1 February 2013 13:03, Santosh Shilimkar wrote: I am not talking about just notifiers. This is for external users who has subscribed for notifiers. The point is whether the core CPUFReq gets updated without that flag for all affected CPU. Yes, its safe. Follow this thread, yesterday i explained this to Tomasz Figa: http://www.spinics.net/lists/arm-kernel/msg221629.html That part was very clear to me Viresh. Anyway thanks for the link. From what I read so far, it might just work but I would want to try it out before acking the approach. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] cpufreq: Remove unnecessary use of policy->shared_type
Viresh, On Friday 01 February 2013 02:22 PM, Santosh Shilimkar wrote: On Friday 01 February 2013 01:32 PM, Viresh Kumar wrote: On 1 February 2013 13:03, Santosh Shilimkar wrote: I am not talking about just notifiers. This is for external users who has subscribed for notifiers. The point is whether the core CPUFReq gets updated without that flag for all affected CPU. Yes, its safe. Follow this thread, yesterday i explained this to Tomasz Figa: http://www.spinics.net/lists/arm-kernel/msg221629.html That part was very clear to me Viresh. Anyway thanks for the link. From what I read so far, it might just work but I would want to try it out before acking the approach. You are correct. Sorry for oversight on your initial point about the usage of the flag. When I added that flag, I just went by the description thinking the cpufreq core booking and stat updates use the flag. Its not the case. Thanks for the fix. For the patch, Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/3] mmc: omap_hsmmc: convert to dma_request_slave_channel_compat()
On Saturday 02 February 2013 02:31 AM, Matt Porter wrote: Convert dmaengine channel requests to use dma_request_slave_channel_compat(). This supports platforms booting with or without DT populated. Signed-off-by: Matt Porter Acked-by: Tony Lindgren --- Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 3/3] mmc: omap_hsmmc: add generic DMA request support to the DT binding
On Saturday 02 February 2013 02:31 AM, Matt Porter wrote: The binding definition is based on the generic DMA request binding. Signed-off-by: Matt Porter Acked-by: Tony Lindgren --- Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v5 00/14] DMA Engine support for AM33XX
On Friday 25 January 2013 02:19 AM, Matt Porter wrote: On Thu, Jan 24, 2013 at 05:12:05AM +, Shilimkar, Santosh wrote: On Thursday 24 January 2013 02:19 AM, Matt Porter wrote: On Wed, Jan 23, 2013 at 04:37:55PM +0530, Santosh Shilimkar wrote: Matt, On Wednesday 16 January 2013 02:02 AM, Matt Porter wrote: [...] Thanks for the catch, I'll add this in with your tested line as well for the series. BTW, since I'm dropping mmc support from this series, I'll move your patch into a separate series with only the mmc pieces. Fine by me -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: One of these things (CONFIG_HZ) is not like the others..
On Tuesday 22 January 2013 08:35 PM, Santosh Shilimkar wrote: On Tuesday 22 January 2013 08:21 PM, Russell King - ARM Linux wrote: On Tue, Jan 22, 2013 at 03:44:03PM +0530, Santosh Shilimkar wrote: Sorry for not being clear enough. On OMAP, 32KHz is the only clock which is always running(even during low power states) and hence the clock source and clock event have been clocked using 32KHz clock. As mentioned by RMK, with 32768 Hz clock and HZ = 100, there will be always an error of 0.1 %. This accuracy also impacts the timer tick interval. This was the reason, OMAP has been using the HZ = 128. Ok. Let's look at this. As far as time-of-day is concerned, this shouldn't really matter with the clocksource/clockevent based system that we now have (where *important point* platforms have been converted over.) Any platform providing a clocksource will override the jiffy-based clocksource. The measurement of time-of-day passing is now based on the difference in values read from the clocksource, not from the actual tick rate. Anything _not_ providing a clock source will be reliant on jiffies incrementing, which in turn _requires_ one timer interrupt per jiffies at a known rate (which is HZ). Now, that's the time of day, what about jiffies? Well, jiffies is incremented based on a certain number of nsec having passed since the last jiffy update. That means the code copes with dropped ticks and the like. However, if your actual interrupt rate is close to the desired HZ, then it can lead to some interesting effects (and noise): - if the interrupt rate is slightly faster than HZ, then you can end up with updates being delayed by 2x interrupt rate. - if the interrupt rate is slightly slower than HZ, you can occasionally end up with jiffies incrementing by two. - if your interrupt rate is dead on HZ, then other system noise can come into effect and you may get maybe zero, one or two jiffy increments per interrupt. (You have to think about time passing in NS, where jiffy updates should be vs where the timer interrupts happen.) See tick_do_update_jiffies64() for the details. The timer infrastructure is jiffy based - which includes scheduling where the scheduler does not use hrtimers. That means a slight discrepency between HZ and the actual interrupt rate can cause around 1/HZ jitter. That's a matter of fact due to how the code works. So, actually, I think the accuracy of HZ has much overall effect _provided_ a platform provides a clocksource to the accuracy of jiffy based timers nor timekeeping. For those which don't, the accuracy of the timer interrupt to HZ is very important. (This is just based on reading some code and not on practical experiments - I'd suggest some research of this is done, trying HZ=100 on OMAP's 32kHz timers, checking whether there's any drift, checking how accurately a single task can be woken from various select/poll/epoll delays, and checking whether NTP works.) Thanks for expanding it. It is really helpful. And I think further discussion is pointless until such research has been done (or someone who _really_ knows the time keeping/timer/sched code inside out comments.) Fully agree about experimentation to re-asses the drift. From what I recollect from past, few OMAP customers did report the time drift issue and that is how the switch from 100 --> 128 happened. Anyway I have added the suggested task to my long todo list. So I tried to see if any time drift with HZ = 100 on OMAP. I ran the setup for 62 hours and 27 mins with time synced up once with NTP server. I measure about ~174 millisecond drift which is almost noise considering the observed duration was ~22482 milliseconds. Am re-running the setup with HZ = 128 for similar time frame to see if the minimal drift observed goes away. Once through that, I will send a patch to update the OMAP to use HZ = 100 and possibly get rid of the custom OMAP HZ config. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: too many timer retries happen when do local timer swtich with broadcast timer
On Saturday 23 February 2013 12:22 AM, Thomas Gleixner wrote: On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote: On Fri, Feb 22, 2013 at 03:03:02PM +, Thomas Gleixner wrote: On Fri, 22 Feb 2013, Lorenzo Pieralisi wrote: On Fri, Feb 22, 2013 at 12:07:30PM +, Thomas Gleixner wrote: Now we could make use of that and avoid going deep idle just to come back right away via the IPI. Unfortunately the notification thingy has no return value, but we can fix that. To confirm that theory, could you please try the hack below and add some instrumentation (trace_printk)? Applied, and it looks like that's exactly why the warning triggers, at least on the platform I am testing on which is a dual-cluster ARM testchip. I too confirm that the warnings cause is same. There is a still time window though where the CPU (the IPI target) can get back to idle (tick_broadcast_pending still not set) before the CPU target of the broadcast has a chance to run tick_handle_oneshot_broadcast (and set tick_broadcast_pending), or am I missing something ? Well, the tick_broadcast_pending bit is uninteresting if the force_broadcast bit is set. Because if that bit is set we know for sure, that we got woken with the cpu which gets the broadcast timer and raced back to idle before the broadcast handler managed to send the IPI. Gah, my bad sorry, I mixed things up. I thought tick_check_broadcast_pending() was checking against the tick_broadcast_pending mask not tick_force_broadcast_mask Yep, that's a misnomer. I just wanted to make sure that my theory is correct. I need to think about the real solution some more. We have two alternatives: 1) Make the clockevents_notify function have a return value. 2) Add something like the hack I gave you with a proper name. The latter has the beauty, that we just need to modify the platform independent idle code instead of going down to every callsite of the clockevents_notify thing. I agree that 2 is better alternative to avoid multiple changes. Whichever alternative you choose, will be happy to test it :) Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/4] ARM: timer-sp: Set dynamic irq affinity
On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote: From: Viresh Kumar When a cpu goes to a deep idle state where its local timer is shutdown, it notifies the time frame work to use the broadcast timer instead. Unfortunately, the broadcast device could wake up any CPU, including an idle one which is not concerned by the wake up at all. Broad-cast device will only open the CPU for which the timer IRQ affined to. And infact with subject series the affinity also is updated for the CPU which owns the last timer expiry event. This implies, in the worst case, an idle CPU will wake up to send an IPI to another idle cpu. This patch fixes this for ARM platforms using timer-sp, by setting CLOCK_EVT_FEAT_DYNIRQ feature. Signed-off-by: Viresh Kumar Signed-off-by: Daniel Lezcano --- What am I missing here ? Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/4] ARM: timer-sp: Set dynamic irq affinity
On Wednesday 27 February 2013 10:29 AM, Viresh Kumar wrote: On 27 February 2013 10:26, Santosh Shilimkar wrote: On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote: From: Viresh Kumar When a cpu goes to a deep idle state where its local timer is shutdown, it notifies the time frame work to use the broadcast timer instead. Unfortunately, the broadcast device could wake up any CPU, including an idle one which is not concerned by the wake up at all. Broad-cast device will only open the CPU for which the timer IRQ affined to. And infact with subject series the affinity also is updated for the CPU which owns the last timer expiry event. What am I missing here ? Dynamic affinity will work only if the following flag is set for a clock_event_device: CLOCK_EVT_FEAT_DYNIRQ, otherwise wakeup would happen on the cpu to which static affinity was set to. I should have looked at the patches in order first :) Sorry for the noise. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/4] time : pass broadcast parameter
On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote: The broadcast timer could be passed as parameter to the function instead of using again tick_broadcast_device.evtdev which was previously used in the caller function. Signed-off-by: Daniel Lezcano --- The change doesn't buy us as such even after looking at next patch which tries to use bc. No strong opinion though. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] time : set broadcast irq affinity
On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote: When a cpu goes to a deep idle state where its local timer is shutdown, it notifies the time frame work to use the broadcast timer instead. Unfortunately, the broadcast device could wake up any CPU, including an idle one which is not concerned by the wake up at all. This implies, in the worst case, an idle CPU will wake up to send an IPI to another idle cpu. This patch solves this by setting the irq affinity to the cpu concerned by the nearest timer event, by this way, the CPU which is wake up is guarantee to be the one concerned by the next event and we are safe with unnecessary wakeup for another idle CPU. As the irq affinity is not supported by all the archs, a flag is needed to specify which clocksource can handle it. Minor. Can mention the flag name as well here "CLOCK_EVT_FEAT_DYNIRQ" Signed-off-by: Daniel Lezcano --- include/linux/clockchips.h |1 + kernel/time/tick-broadcast.c | 39 --- 2 files changed, 33 insertions(+), 7 deletions(-) diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h index 6634652..c256cea 100644 --- a/include/linux/clockchips.h +++ b/include/linux/clockchips.h @@ -54,6 +54,7 @@ enum clock_event_nofitiers { */ #define CLOCK_EVT_FEAT_C3STOP 0x08 #define CLOCK_EVT_FEAT_DUMMY 0x10 +#define CLOCK_EVT_FEAT_DYNIRQ 0x20 Please add some comments about the usage of the flag. /** * struct clock_event_device - clock event device descriptor diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 6197ac0..1f7b4f4 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -406,13 +406,36 @@ struct cpumask *tick_get_broadcast_oneshot_mask(void) return to_cpumask(tick_broadcast_oneshot_mask); } -static int tick_broadcast_set_event(struct clock_event_device *bc, +/* + * Set broadcast interrupt affinity + */ +static void tick_broadcast_set_affinity(struct clock_event_device *bc, int cpu) +{ Better is just make second parameter as cpu_mask rather than CPU cpu number. Its a semantic of affinity hook which you can easily retain. + if (!(bc->features & CLOCK_EVT_FEAT_DYNIRQ)) + return; + + if (cpumask_equal(bc->cpumask, cpumask_of(cpu))) + return; + + bc->cpumask = cpumask_of(cpu); You can avoid the cpumask_of() couple of times above. + irq_set_affinity(bc->irq, bc->cpumask); +} + +static int tick_broadcast_set_event(struct clock_event_device *bc, int cpu, ktime_t expires, int force) { + int ret; + if (bc->mode != CLOCK_EVT_MODE_ONESHOT) clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); - return clockevents_program_event(bc, expires, force); + ret = clockevents_program_event(bc, expires, force); + if (ret) + return ret; + + tick_broadcast_set_affinity(bc, cpu); In case you go by cpumask paramater, then above can be just tick_broadcast_set_affinity(bc, cpumask_of(cpu)); + + return 0; } int tick_resume_broadcast_oneshot(struct clock_event_device *bc) @@ -441,7 +464,7 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev) { struct tick_device *td; ktime_t now, next_event; - int cpu; + int cpu, next_cpu; raw_spin_lock(&tick_broadcast_lock); again: @@ -454,8 +477,10 @@ again: td = &per_cpu(tick_cpu_device, cpu); if (td->evtdev->next_event.tv64 <= now.tv64) cpumask_set_cpu(cpu, to_cpumask(tmpmask)); - else if (td->evtdev->next_event.tv64 < next_event.tv64) + else if (td->evtdev->next_event.tv64 < next_event.tv64) { next_event.tv64 = td->evtdev->next_event.tv64; + next_cpu = cpu; + } } /* @@ -478,7 +503,7 @@ again: * Rearm the broadcast device. If event expired, * repeat the above */ - if (tick_broadcast_set_event(dev, next_event, 0)) + if (tick_broadcast_set_event(dev, next_cpu, next_event, 0)) goto again; } raw_spin_unlock(&tick_broadcast_lock); @@ -521,7 +546,7 @@ void tick_broadcast_oneshot_control(unsigned long reason) cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask()); clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN); if (dev->next_event.tv64 < bc->next_event.tv64) - tick_broadcast_set_event(bc, dev->next_event, 1); + tick_broadcast_set_event(bc, cpu, dev->next_event, 1); Since you have embedded the irq_affinity() in above function, the IRQ affinity for bc->irq will remain to the last CPU on which the inte
Re: [PATCH 0/4] time: dynamic irq affinity
On Wednesday 27 February 2013 03:47 AM, Daniel Lezcano wrote: When a cpu goes to a deep idle state where its local timer is shutdown, it notifies the time framework to use the broadcast timer instead. Unfortunately, the broadcast device could wake up any CPU, including an idle one which is not concerned by the wake up at all. This implies, in the worst case, an idle CPU will wake up to send an IPI to another idle cpu. This patch solves this by setting the irq affinity to the cpu concerned by the nearest timer event, by this way, the CPU which is wake up is guarantee to be the one concerned by the next event and we are safe with unnecessary wakeup for another idle CPU. As the irq affinity is not supported by all the archs, a flag is needed to specify which clocksource can handle it. Not completely related to this series but there is another issue where this local timer not wakeup capable hurts. So far we are discussing only the timer related future events which are known and can be programmed with broadcast device. But think of the scenario's where we need to send asynchronous IPIs to CPUs to do some work. e.g generic_exec_single(). If the CPU which is suppose to be available after IPI call is in deep low power state, then the IPI(implemented on ARM) isn't effective. In CPU off idle modes, a GIC SGI will not wake the CPU and hence a special wakeup is needed to bring out those CPUs out of idle. This special wakeup is handled by broad-cast timer in case of CPUIDLE. In short what I mean is, you need to have IPI which can wakeup CPUs from any deep idle power state to address above. Has anybody thought of this one ? Regards, Santosh P.S: Time and again it proves that making the local timer wakeup capable solves the issue. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 0/8] ARM: AM43 (OMAP2+) boot support
On Monday 18 February 2013 05:05 PM, Afzal Mohammed wrote: (Resending, since it seems, LAKML doesn't accept patches with subject prefix only as "RFC", but requires "PATCH" prefix also) Hi, This series adds minimal support to boot Linux on platforms having AM43 based SoC's. This is being sent as an RFC to seek opinion about modification in twd to register percpu local timer clock event for scheduler tick in the case of one core SMP. AM43 SoC's are based on ARM Cortex-A9. It is an ARM Cortex-A9 SMP configuration with one core (not uniprocessor configuration). AM43 is similar to AM335x in it's peripheral capabilities, with many of the peripheral register mapping's similar like that of uart. After looking at the specs, you don't need the SMP mode since ACP isn't being used. AM43 is in pre-silicon stage and currently there are no public documents. This series has been tested on a pre-silicon platform that emulates AM43 SoC, changes proposed here are minimal - to get it booting. Kernel was directly run without the help of bootloader - Images were directly loaded onto a pre-initialized RAM and ARM registers updated as required for booting. Changes have been made over linux-next (next-20130213) with three "OF" related reverts (which otherwise causes problem in other platforms also) and compiled with omap2plus_defconfig. Multiplatform option was enabled, while most of CONFIG options were deselected for a faster boot. Beagle bone boots as earlier with these changes. An interesting observation is that it may be possible to boot this platform to console without any platform specific modification to proper Kernel (by that I mean excluding DT sources) using Arnd's, "[PATCH,RFC] default machine descriptor for multiplatform", along with a "CLOCKSOURCE_OF_DECLARE" for smp twd. TWD use for AM437x is also limited because these times stops in low power sates and there you will need broad-cast mechanism which again more of SMP machine feature. So I suggest to use the wakeup timer(GPT1) clock-event instead of local timer for the mentioned reason. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 2/8] ARM: twd: register clock event for 1 core SMP
On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote: Register percpu local timer for scheduler tick in the case of one core SMP configuration. In other cases - secondary cpu's as well as boot cpu's having more than one core, this is being registered as per existing boot flow, with a difference that they happens after delay calibration. Registering the clock for tick in case of one core should be done before Kernel calibrates delay (this is required to boot, unless local timer is the only one registered for tick). Registering twd local timer at init_time (which platforms are doing now) helps achieve that with the proposed change. This helps in an almost booting Kernel (minimal) by only relying on ARM parts for an A9 one core SMP. Signed-off-by: Afzal Mohammed --- As mentioned in cover-letter, I don't think we have good reasoning to make TWD to work with UP configuration. Even you fix the timer code, there are more cascaded dependencies which is not worth the effort. Regards, Samtosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 3/8] ARM: twd: clock rate from DT (if no DT clk tree)
On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote: Add an optional property to find clock-frequency from DT. This helps as a fallback mechanism in case there is no representation of clock tree in DT. Signed-off-by: Afzal Mohammed --- This won't work always because twd clock is CPU dependent and changes with CPU clock speed. And more importantly to get CPUFreq working, you need to provide the correct clock-node to TWD library. Refer OMAP4 clock data for reference. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 1/8] ARM: localtimer: return percpu clkevt on register
On Monday 18 February 2013 05:06 PM, Afzal Mohammed wrote: Return percpu clock event on local timer register. It is the boot cpu that calls this and it can use the returned percpu clock event to register a clock event in the case of SMP configuration with one core. SMP configuration with 1 core is UP :-) Jokes apart as said already, lets see whether we really need it. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 4/8] ARM: am33xx: ll debug config help
On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote: Selecting DEBUG_AM33XXUART1 routes low level debug messages to first UART instance of AM335x based SoC's. This selection is valid for upcoming AM43 based SoC's too. Make this information available upon configuring. Signed-off-by: Afzal Mohammed --- arch/arm/Kconfig.debug | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug index aca..b717b78 100644 --- a/arch/arm/Kconfig.debug +++ b/arch/arm/Kconfig.debug @@ -542,6 +542,9 @@ choice config DEBUG_AM33XXUART1 bool "AM33XX UART1" + help + Route low level debug messages to first uart instance + for boards based on am335 and am43 family of SoC's config DEBUG_ZOOM_UART bool "Zoom2/3 UART" With DT, IIRC DEBUGLL is broken. So did you hack debug-macro.S to get the earlyprintk working ? Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 5/8] ARM: OMAP2+: am43: Kconfig
On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote: Add Kconfig option for AM43 family of SoC's, these are ARM Cortex A9 based (SMP configuration with 1 core). Signed-off-by: Afzal Mohammed --- arch/arm/mach-omap2/Kconfig | 11 +++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index 49ac3df..683fbaa 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -141,6 +141,17 @@ config SOC_AM33XX select MULTI_IRQ_HANDLER select COMMON_CLK +config SOC_AM43 + bool "TI AM43" + depends on ARCH_OMAP2PLUS + default y + select CPU_V7 + select HAVE_SMP You don't need this + select LOCAL_TIMERS if SMP This one as well + select MULTI_IRQ_HANDLER And this one too I guess. Rest is fine. regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 6/8] ARM: OMAP2+: am43: basic dt support
On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote: Describe minimal DT boot machine details for AM43 based SoC's. AM43 family of SoC's are ARM Cortex-A9 based with one core in SMP configuration. Low level debug could be achieved by selecting DEBUG_AM33XXUART1. To boot AM43 SoC, this change is sufficient w.r.t Kernel (considering the fact that strictly speaking DT sources does not classify as a part of Kernel). Here private timer of the one and only core is being used as clock event (as well as, as time source). Signed-off-by: Afzal Mohammed --- As discussed already, lets just call this as Cortex-A9 single core machine to avoid confusion. arch/arm/mach-omap2/board-generic.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c index 0274ff7..e083f08 100644 --- a/arch/arm/mach-omap2/board-generic.c +++ b/arch/arm/mach-omap2/board-generic.c @@ -15,7 +15,10 @@ #include #include #include +#include +#include +#include #include #include "common.h" @@ -182,3 +185,18 @@ DT_MACHINE_START(OMAP5_DT, "Generic OMAP5 (Flattened Device Tree)") .restart= omap44xx_restart, MACHINE_END #endif + +#ifdef CONFIG_SOC_AM43 +static const char *am43_boards_compat[] __initdata = { + "ti,am43", + NULL, +}; + +DT_MACHINE_START(AM43_DT, "Generic AM43 (Flattened Device Tree)") + .map_io = debug_ll_io_init, + .init_irq = irqchip_init, With Arnds patch [1], the above can be dropped.. + .init_machine = omap_generic_init, + .init_time = twd_local_timer_of_register, This one needs to take care of other timers as well from start. So better to take that approach from start. Regards, Santosh [1] https://patchwork.kernel.org/patch/2074871/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support
On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote: AM43 SoC is in pre-silicon stage, meanwhile it has been modelled in a pre-silicon platform. To validate and boot Linux in pre-silicon platform that emulates an AM43 SoC, add DT build support. As bootloader is not used, bootargs is passed through DT. Note: This would be replaced by an original board support. Signed-off-by: Afzal Mohammed --- arch/arm/boot/dts/Makefile | 3 ++- arch/arm/boot/dts/am43-pre-silicon.dts | 31 +++ 2 files changed, 33 insertions(+), 1 deletion(-) create mode 100644 arch/arm/boot/dts/am43-pre-silicon.dts diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile index 94d88b9..b434344 100644 --- a/arch/arm/boot/dts/Makefile +++ b/arch/arm/boot/dts/Makefile @@ -124,7 +124,8 @@ dtb-$(CONFIG_ARCH_OMAP2PLUS) += omap2420-h4.dtb \ omap5-evm.dtb \ am335x-evm.dtb \ am335x-evmsk.dtb \ - am335x-bone.dtb + am335x-bone.dtb \ + am43-pre-silicon.dtb dtb-$(CONFIG_ARCH_ORION5X) += orion5x-lacie-ethernet-disk-mini-v2.dtb dtb-$(CONFIG_ARCH_PRIMA2) += prima2-evb.dtb dtb-$(CONFIG_ARCH_U8500) += snowball.dtb \ diff --git a/arch/arm/boot/dts/am43-pre-silicon.dts b/arch/arm/boot/dts/am43-pre-silicon.dts new file mode 100644 index 000..b9c6297 --- /dev/null +++ b/arch/arm/boot/dts/am43-pre-silicon.dts Well the pre-silicon platform and the SOC are very close and at least the support you are adding here is exactly same. So lets just use am437x.dtb or something like that. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support
On Tuesday 19 February 2013 04:22 PM, Mohammed, Afzal wrote: Hi Santosh, On Tue, Feb 19, 2013 at 16:05:22, Shilimkar, Santosh wrote: On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote: AM43 SoC is in pre-silicon stage, meanwhile it has been modelled in a pre-silicon platform. To validate and boot Linux in pre-silicon platform that emulates an AM43 SoC, add DT build support. Note: This would be replaced by an original board support. - am335x-bone.dtb + am335x-bone.dtb \ + am43-pre-silicon.dtb Well the pre-silicon platform and the SOC are very close and at least the support you are adding here is exactly same. So lets just use am437x.dtb or something like that. SoC support is already added in patch 7/8. This is board (which doesn't exist now) support, hence a pre-silicon temporary one to validate it. I mean we can call it am437x-xyxboard.dtb already considering the data here can be re-used. Boot-args can be used from default kernel config with CONFIG_CMDLINE_FORCE. No strong opinion if you still insist to have a pre-silicon dtb. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 5/8] ARM: OMAP2+: am43: Kconfig
On Tuesday 19 February 2013 04:26 PM, Felipe Balbi wrote: On Tue, Feb 19, 2013 at 03:57:07PM +0530, Santosh Shilimkar wrote: On Monday 18 February 2013 05:08 PM, Afzal Mohammed wrote: Add Kconfig option for AM43 family of SoC's, these are ARM Cortex A9 based (SMP configuration with 1 core). Signed-off-by: Afzal Mohammed --- arch/arm/mach-omap2/Kconfig | 11 +++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index 49ac3df..683fbaa 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -141,6 +141,17 @@ config SOC_AM33XX select MULTI_IRQ_HANDLER select COMMON_CLK +config SOC_AM43 + bool "TI AM43" + depends on ARCH_OMAP2PLUS + default y + select CPU_V7 + select HAVE_SMP You don't need this actually, this is needed for CONFIG_SMP_ON_UP Ahh.. I missed that. Thanks Felipe for pointing it out. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 4/8] ARM: am33xx: ll debug config help
On Tuesday 19 February 2013 04:00 PM, Mohammed, Afzal wrote: Hi Santosh, On Tue, Feb 19, 2013 at 15:55:59, Shilimkar, Santosh wrote: With DT, IIRC DEBUGLL is broken. So did you hack debug-macro.S to get the earlyprintk working ? No, on linux-next, ll debug works properly. Indeed. Tony fixed that now. Some how I missed this patch on the list. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 8/8] ARM: dts: am43-pre-silicon support
On Tuesday 19 February 2013 04:33 PM, Mohammed, Afzal wrote: Hi Santosh, On Tue, Feb 19, 2013 at 16:30:13, Shilimkar, Santosh wrote: On Tuesday 19 February 2013 04:22 PM, Mohammed, Afzal wrote: SoC support is already added in patch 7/8. This is board (which doesn't exist now) support, hence a pre-silicon temporary one to validate it. I mean we can call it am437x-xyxboard.dtb already considering the data here can be re-used. Boot-args can be used from default kernel config with CONFIG_CMDLINE_FORCE. No strong opinion if you still insist to have a pre-silicon dtb. This patch would be replaced by original board, once it is known. This was included to make a working complete series and if someone wants (internally) to test the series as is. Ohh. I assumed it is for merge as well. Thanks for clarification. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH, RFC 2/8] ARM: twd: register clock event for 1 core SMP
On Tuesday 19 February 2013 05:44 PM, Felipe Balbi wrote: On Tue, Feb 19, 2013 at 03:44:14PM +0530, Santosh Shilimkar wrote: On Monday 18 February 2013 05:07 PM, Afzal Mohammed wrote: Register percpu local timer for scheduler tick in the case of one core SMP configuration. In other cases - secondary cpu's as well as boot cpu's having more than one core, this is being registered as per existing boot flow, with a difference that they happens after delay calibration. Registering the clock for tick in case of one core should be done before Kernel calibrates delay (this is required to boot, unless local timer is the only one registered for tick). Registering twd local timer at init_time (which platforms are doing now) helps achieve that with the proposed change. This helps in an almost booting Kernel (minimal) by only relying on ARM parts for an A9 one core SMP. Signed-off-by: Afzal Mohammed --- As mentioned in cover-letter, I don't think we have good reasoning to make TWD to work with UP configuration. Even you fix the timer code, there are more cascaded dependencies which is not worth the effort. if CONFIG_SMP_ON_UP is enabled, smp_twd.c can still be compiled, right ? Yep though just from deps pesrpective TWD is made available for ARM SMP machines as below config HAVE_ARM_TWD bool depends on SMP Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [resend] Timer broadcast question
On Tuesday 19 February 2013 11:51 PM, Daniel Lezcano wrote: On 02/19/2013 07:10 PM, Thomas Gleixner wrote: On Tue, 19 Feb 2013, Daniel Lezcano wrote: I am working on identifying the different wakeup sources from the interrupts and I have a question regarding the timer broadcast. The broadcast timer is setup to the next event and that will wake up any idle cpu belonging to the "broadcast cpumask", right ? The cpu which has been woken up will look for each cpu the next-event and send an IPI to wake it up. Although, it is possible the sender of this IPI may not be concerned by the timer expiration and has been woken up just for sending the IPI, right ? Correct. If this is correct, is it possible to setup the timer irq affinity to a cpu which will be concerned by the timer expiration ? so we prevent an unnecessary wake up for a cpu. It is possible, but we never implemented it. If we go there, we want to make that conditional on a property flag, because some interrupt controllers especially on x86 only allow to move the affinity from interrupt context, which is pointless. Thanks Thomas for your quick answer. I will write a RFC patchset. Last year I implemented the affinity hook for broad-cast code and experimented with it. Since the system I was using was dual core, it wasn't much beneficial and hence gave up later. I did remember discussing the approach with few folks in the conference. Patch in the end of the email (also attached) for generic broadcast code. I didn't look at all corner case though. In arch code then you need to setup "broadcast_affinity" hook which should be able to get handle of the arch irqchip and call the respective affinity handler. Just 3 lines function should do the trick. As Thomas said, effectiveness of such optimization solely depends on how well the affinity (in low powers) supported by your IRQ chip. Hope this is helpful for you. Regards, Santosh From d70f2d48ec08a3f1d73187c49b16e4e60f81a50c Mon Sep 17 00:00:00 2001 From: Santosh Shilimkar Date: Wed, 25 Jul 2012 03:42:33 +0530 Subject: [PATCH] tick-broadcast: Add tick road-cast affinity suport Current tick broad-cast code has affinity set to the boot CPU and hence the boot CPU will always wakeup from low power states when broad cast timer is armed even if the next expiry event doesn't belong to it. Patch adds broadcast affinity functionality to avoid above and let the tick framework set the affinity of the event for the CPU it belongs. Signed-off-by: Santosh Shilimkar --- include/linux/clockchips.h |2 ++ kernel/time/tick-broadcast.c | 13 - 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h index 8a7096f..5488cdc 100644 --- a/include/linux/clockchips.h +++ b/include/linux/clockchips.h @@ -95,6 +95,8 @@ struct clock_event_device { unsigned long retries; void(*broadcast)(const struct cpumask *mask); + void(*broadcast_affinity) + (const struct cpumask *mask, int irq); void(*set_mode)(enum clock_event_mode mode, struct clock_event_device *); void(*suspend)(struct clock_event_device *); diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index f113755..2ec2425 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -39,6 +39,8 @@ static void tick_broadcast_clear_oneshot(int cpu); static inline void tick_broadcast_clear_oneshot(int cpu) { } #endif +static inline void dummy_broadcast_affinity(const struct cpumask *mask, + int irq) { } /* * Debugging: see timer_list.c */ @@ -485,14 +487,19 @@ void tick_broadcast_oneshot_control(unsigned long reason) if (!cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) { cpumask_set_cpu(cpu, tick_get_broadcast_oneshot_mask()); clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN); - if (dev->next_event.tv64 < bc->next_event.tv64) + if (dev->next_event.tv64 < bc->next_event.tv64) { tick_broadcast_set_event(dev->next_event, 1); + bc->broadcast_affinity( + tick_get_broadcast_oneshot_mask(), bc->irq); + } } } else { if (cpumask_test_cpu(cpu, tick_get_broadcast_oneshot_mask())) { cpumask_clear_cpu(cpu, tick_get_broadcast_oneshot_mask()); clockevents_set_mode(dev, CLOCK_EVT_MODE_ONESHOT); + bc->broadcast_affinity( +
Re: [resend] Timer broadcast question
On Thursday 21 February 2013 02:31 PM, Daniel Lezcano wrote: On 02/21/2013 07:19 AM, Santosh Shilimkar wrote: On Tuesday 19 February 2013 11:51 PM, Daniel Lezcano wrote: On 02/19/2013 07:10 PM, Thomas Gleixner wrote: On Tue, 19 Feb 2013, Daniel Lezcano wrote: I am working on identifying the different wakeup sources from the interrupts and I have a question regarding the timer broadcast. The broadcast timer is setup to the next event and that will wake up any idle cpu belonging to the "broadcast cpumask", right ? The cpu which has been woken up will look for each cpu the next-event and send an IPI to wake it up. Although, it is possible the sender of this IPI may not be concerned by the timer expiration and has been woken up just for sending the IPI, right ? Correct. If this is correct, is it possible to setup the timer irq affinity to a cpu which will be concerned by the timer expiration ? so we prevent an unnecessary wake up for a cpu. It is possible, but we never implemented it. If we go there, we want to make that conditional on a property flag, because some interrupt controllers especially on x86 only allow to move the affinity from interrupt context, which is pointless. Thanks Thomas for your quick answer. I will write a RFC patchset. Last year I implemented the affinity hook for broad-cast code and experimented with it. Since the system I was using was dual core, it wasn't much beneficial and hence gave up later. I did remember discussing the approach with few folks in the conference. I did a brief test with a similar patch on a ARM u8500 board. The timer is tied with CPU0 by default, setting the dynamic irq affinity reduce considerably the number of IPI. The difference with your patch is the affinity is set to one CPU, the first one which is supposed to be wake up by the timer expiration. This is easy to spot with a small program doing usleep wired on CPU1. We see CPU0 waking up to send an IPI to CPU1 and going to idle again. I don't know how that behaves with OMAP4 with this patch (which I guess it is the board you used), but the coupled idle state traces could be ambiguous if you relied on it to check the benefit of this patch. Across OMAP4 and OMAP5 based devices, only the general purpose OMAP5 devices the approach was useful. Rest of the devices had constraints of master CPU(CPU0) waking up first always which in turns means pining the affinity to that CPU always which the current code already does. That was also another reason I didn't persue it further. IMO, it is worth to implement such solution and perhaps we can extend it to optimize the package idle time with the generic power domain tied with the irq. Anyway, it is a random thought let's see that later :) It is surely a good optimization especially for multi-core CPUIdle. Patch in the end of the email (also attached) for generic broadcast code. I didn't look at all corner case though. In arch code then you need to setup "broadcast_affinity" hook which should be able to get handle of the arch irqchip and call the respective affinity handler. Just 3 lines function should do the trick. As Thomas said, effectiveness of such optimization solely depends on how well the affinity (in low powers) supported by your IRQ chip. Hope this is helpful for you. Thanks a lot for your patch and your feedbacks. Am glad that it was helpful. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: too many timer retries happen when do local timer swtich with broadcast timer
Thomas, On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote: Jason, On Thu, 21 Feb 2013, Jason Liu wrote: 2013/2/21 Thomas Gleixner : Now your explanation makes sense. I have no fast solution for this, but I think that I have an idea how to fix it. Stay tuned. Thanks Thomas, wait for your fix. :) find below a completely untested patch, which should address that issue. After looking at the thread, I tried to see the issue on OMAP and could see the same issue as Jason. Your patch fixes the retries on both CPUs on my dual core machine. So you use my tested by if you need one. Tested-by: Santosh Shilimkar Thanks for the patch. And thanks to Jason for spotting the issue. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: too many timer retries happen when do local timer swtich with broadcast timer
Thomas, On Friday 22 February 2013 03:49 AM, Thomas Gleixner wrote: On Thu, 21 Feb 2013, Santosh Shilimkar wrote: On Thursday 21 February 2013 07:18 PM, Thomas Gleixner wrote: find below a completely untested patch, which should address that issue. After looking at the thread, I tried to see the issue on OMAP and could see the same issue as Jason. That's interesting. We have the same issue on x86 since 2007 and nobody noticed ever. It's basically the same problem there, but it seems that on x86 getting out of those low power states is way slower than the minimal reprogramming delta which is used to enforce the local timer to fire after the wakeup. I'm still amazed that as Jason stated a 1us reprogramming delta is sufficient to get this ping-pong going. I somehow doubt that, but maybe ARM is really that fast :) Your patch fixes the retries on both CPUs on my dual core machine. So you use my tested by if you need one. They are always welcome. BTW, Lorenzo off-list mentioned to me about warning in boot-up which I missed while testing your patch. It will take bit more time for me to look into it and hence thought of reporting it. [2.186126] [ cut here ] [2.190979] WARNING: at kernel/time/tick-broadcast.c:501 tick_broadcast_oneshot_control+0x1c0/0x21c() [2.200622] Modules linked in: [2.203826] [] (unwind_backtrace+0x0/0xf0) from [] (warn_slowpath_common+0x4c/0x64) [2.213653] [] (warn_slowpath_common+0x4c/0x64) from [] (warn_slowpath_null+0x1c/0x24) [2.223754] [] (warn_slowpath_null+0x1c/0x24) from [] (tick_broadcast_oneshot_control+0x1c0/0x21c) [2.234924] [] (tick_broadcast_oneshot_control+0x1c0/0x21c) from [] (tick_notify+0x23c/0x42c) [2.245666] [] (tick_notify+0x23c/0x42c) from [] (notifier_call_chain+0x44/0x84) [2.255218] [] (notifier_call_chain+0x44/0x84) from [] (raw_notifier_call_chain+0x18/0x20) [2.265686] [] (raw_notifier_call_chain+0x18/0x20) from [] (clockevents_notify+0x2c/0x174) [2.276123] [] (clockevents_notify+0x2c/0x174) from [] (omap_enter_idle_smp+0x3c/0x120) [2.286315] [] (omap_enter_idle_smp+0x3c/0x120) from [] (cpuidle_enter+0x14/0x18) [2.295928] [] (cpuidle_enter+0x14/0x18) from [] (cpuidle_wrap_enter+0x34/0xa0) [2.305389] [] (cpuidle_wrap_enter+0x34/0xa0) from [] (cpuidle_idle_call+0xe0/0x328) [2.315307] [] (cpuidle_idle_call+0xe0/0x328) from [] (cpu_idle+0x8c/0x11c) [2.324401] [] (cpu_idle+0x8c/0x11c) from [] (start_kernel+0x2b0/0x300) [2.333129] ---[ end trace 6fe1f7b4606a9e20 ]--- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: too many timer retries happen when do local timer swtich with broadcast timer
On Friday 22 February 2013 03:54 PM, Thomas Gleixner wrote: On Fri, 22 Feb 2013, Santosh Shilimkar wrote: BTW, Lorenzo off-list mentioned to me about warning in boot-up which I missed while testing your patch. It will take bit more time for me to look into it and hence thought of reporting it. [2.186126] [ cut here ] [2.190979] WARNING: at kernel/time/tick-broadcast.c:501 tick_broadcast_oneshot_control+0x1c0/0x21c() Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ? The force broadcast mask one coming from below. "WARN_ON_ONCE(test_bit(cpu, tick_force_broadcast_mask));" Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: too many timer retries happen when do local timer swtich with broadcast timer
On Friday 22 February 2013 04:01 PM, Lorenzo Pieralisi wrote: On Fri, Feb 22, 2013 at 10:24:00AM +, Thomas Gleixner wrote: On Fri, 22 Feb 2013, Santosh Shilimkar wrote: BTW, Lorenzo off-list mentioned to me about warning in boot-up which I missed while testing your patch. It will take bit more time for me to look into it and hence thought of reporting it. [2.186126] [ cut here ] [2.190979] WARNING: at kernel/time/tick-broadcast.c:501 tick_broadcast_oneshot_control+0x1c0/0x21c() Which one is that? tick_broadcast_pending or tick_force_broadcast_mask ? It is the tick_force_broadcast_mask and I think that's because on all systems we are testing, the broadcast timer IRQ is a thundering herd, all CPUs get out of idle at once and try to get out of broadcast mode at more or less the same time. So the issue comes ups only when the idle state used where CPU wakeup more or less at same time as Lorenzo mentioned. I have two platforms where I could test the patch and see the issue only with one platform. Yesterday I didn't notice the warning because it wasn't seen on that platform :-) OMAP4 idle entry and exit in deep state is staggered between CPUs and hence the warning isn't seen. On OMAP5 though, there is an additional C-state where idle entry/exit for CPU isn't staggered and I see the issue in that case. Actually the broad-cast code doesn't expect such a behavior from CPUs since only the broad-cast affine CPU should wake up and rest of the CPU should be woken up by the broad-cast IPIs. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] cpufreq: cpufreq-cpu0: provide compatibility string for DT matchup
On Tuesday 12 March 2013 07:58 PM, Benoit Cousson wrote: > On 03/12/2013 06:07 AM, Santosh Shilimkar wrote: >> On Tuesday 12 March 2013 04:35 AM, Nishanth Menon wrote: >>> commit 5553f9e (cpufreq: instantiate cpufreq-cpu0 as a platform_driver) >>> now forces platform device to be registered for allowing cpufreq-cpu0 >>> to be used by SoCs. example: drivers/cpufreq/highbank-cpufreq.c >>> >>> However, for SoCs that wish to link up using device tree, instead >>> of platform device, provide compatibility string match: >>> compatible = "cpufreq,cpu0"; > > You cannot add a non-HW relative binding... DT is supposed to represent > the pure HW. > AFAIK, cpufreq has nothing to do with the HW definition. > You are right. >>> >>> Cc: "Rafael J. Wysocki" >>> Cc: Santosh Shilimkar >>> Cc: Shawn Guo >>> Cc: linux-kernel@vger.kernel.org >>> Cc: cpuf...@vger.kernel.org >>> Cc: linux...@vger.kernel.org >>> Cc: linux-o...@vger.kernel.org >>> >>> Signed-off-by: Nishanth Menon >>> --- >>> .../devicetree/bindings/cpufreq/cpufreq-cpu0.txt |3 +++ >>> drivers/cpufreq/cpufreq-cpu0.c |6 ++ >>> 2 files changed, 9 insertions(+) >>> >> Looks fine to me. CC'ing dt list in case some one has >> comments on binding updates. >> >> Acked-by: Santosh Shilimkar > > Not-Acked-by-me. > I obviously missed the point while acking the patch. Regards, santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 0/7] tick: Optimize broadcast handling and prevent expiry ping pong
Thomas, On Wednesday 06 March 2013 04:48 PM, Thomas Gleixner wrote: > Jason decoded a problem related to the broadcast timer mode. The > reprogramming of the cpu local timer causes a huge number of > retries. Also there is a situation where the CPU which does not handle > the broadcast timer interrupt exits and reenters broadcast mode before > the broadcast interrupt got handled by another CPU. This can lead to > an interesting ping pong of the broadcast and the cpu local timer > code. > > This series addresses these problems. The first two patches convert > the broadcast code to proper cpumask_var_t instead of adding more > bitmaps later. > > The rest of the series is adopted from the quick patches which I > posted earlier while discussing the issue with Jason et. al. > > Please give it a proper testing on your affected hardware. > I have tested this revised patches on OMAP4 and OMAP5 platforms with CPUidle enabled against 3.9-rc2. As expected they seems to work without any issue and also fixes the reported retry issue. Tested-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: hw_breakpoint: Enable debug powerdown only if system supports 'has_ossr'
Will, On Wednesday 13 March 2013 05:59 PM, Lokesh Vutla wrote: > Hi Dietmar, > On Wednesday 13 March 2013 05:35 PM, Dietmar Eggemann wrote: >> On 13/03/13 06:52, Lokesh Vutla wrote: >>> Commit {9a6eb31 ARM: hw_breakpoint: Debug powerdown support for >>> self-hosted >>> debug} introduces debug powerdown support for self-hosted debug. >>> While merging the patch 'has_ossr' check was removed which >>> was needed for hardwares which doesn't support self-hosted debug. >>> Pandaboard (A9) is one such hardware and Dietmar's orginial >>> patch did mention this issue. >>> Without that check on Panda with CPUIDLE enabled, a flood of >>> below messages thrown. >>> >>> [ 3.597930] hw-breakpoint: CPU 0 failed to disable vector catch >>> [ 3.597991] hw-breakpoint: CPU 1 failed to disable vector catch >>> >> >> Hi Lokesh, >> >> I confirm that this has_ossr condition has to go back into the >> pm_init(void) call. I just verified it again on my Panda board and I get >> the same issue like you without it. >> >> I guess what was happening is that Will asked me if this check is really >> necessary and I said No without re-testing on my Panda board as an V7 >> debug architecture example where OSSR is not implemented. Since then I >> only tested in on V7.1 debug architectures where OSSR is mandatory. >> Sorry about this and thanks for catching this. > Thanks for confirming..:) > Can you please queue this one for 3.9-rc2+ ? Without this patch CPUIDLE is unusable on OMAP4 devices because of those flood of warning messages. I was also wondering whether we should just warn once rather than continuous warnings in the notifier. Patch is end of the email. Regards, Santosh >From b8db63f786719aef835f1ef4e20f3b3406b99b62 Mon Sep 17 00:00:00 2001 From: Santosh Shilimkar Date: Thu, 14 Mar 2013 13:03:25 +0530 Subject: [PATCH] ARM: hw_breakpoints: Use warn_once to avoid message flood on unsupported ossr machines Signed-off-by: Santosh Shilimkar --- arch/arm/kernel/hw_breakpoint.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index 96093b7..5dc1aa6 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -966,7 +966,7 @@ static void reset_ctrl_regs(void *unused) } if (err) { - pr_warning("CPU %d debug is powered down!\n", cpu); + pr_warn_once("CPU %d debug is powered down!\n", cpu); cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu)); return; } @@ -987,7 +987,7 @@ clear_vcr: isb(); if (cpumask_intersects(&debug_err_mask, cpumask_of(cpu))) { - pr_warning("CPU %d failed to disable vector catch\n", cpu); + pr_warn_once("CPU %d failed to disable vector catch\n", cpu); return; } @@ -1007,7 +1007,7 @@ clear_vcr: } if (cpumask_intersects(&debug_err_mask, cpumask_of(cpu))) { - pr_warning("CPU %d failed to clear debug register pairs\n", cpu); + pr_warn_once("CPU %d failed to clear debug register pairs\n", cpu); return; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/1] drivers: bus: Move the OMAP interconnect driver to drivers/bus/
OMAP interconnect drivers are used for the interconnect error handling. Since they are bus driver, lets move it to newly created drivers/bus. Cc: Arnd Bergmann Cc: Tony Lindgren Tested-by: Lokesh Vutla Signed-off-by: Santosh Shilimkar --- Patch just moves OMAP interconnect drivers as is to the newly created driver/bus/* directory. Patch is generated against "arm-soc/drivers/ocp2scp" tree and test on all OMAP boards. arch/arm/mach-omap2/Kconfig|2 ++ arch/arm/mach-omap2/Makefile |5 - drivers/bus/Kconfig|6 ++ drivers/bus/Makefile |3 +++ {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.c |0 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.h |0 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.c |0 {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.h |0 8 files changed, 11 insertions(+), 5 deletions(-) rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.c (100%) rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_noc.h (100%) rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.c (100%) rename {arch/arm/mach-omap2 => drivers/bus}/omap_l3_smx.h (100%) diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index dd2db02..7d3c8ab 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -44,6 +44,7 @@ config ARCH_OMAP3 select ARM_CPU_SUSPEND if PM select MULTI_IRQ_HANDLER select SOC_HAS_OMAP2_SDRC + select OMAP_INTERCONNECT config ARCH_OMAP4 bool "TI OMAP4" @@ -63,6 +64,7 @@ config ARCH_OMAP4 select USB_ARCH_HAS_EHCI if USB_SUPPORT select ARM_CPU_SUSPEND if PM select ARCH_NEEDS_CPU_IDLE_COUPLED + select OMAP_INTERCONNECT config SOC_OMAP5 bool "TI OMAP5" diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile index f6a24b3..7fed980 100644 --- a/arch/arm/mach-omap2/Makefile +++ b/arch/arm/mach-omap2/Makefile @@ -199,11 +199,6 @@ obj-$(CONFIG_ARCH_OMAP4) += omap_hwmod_44xx_data.o # EMU peripherals obj-$(CONFIG_OMAP3_EMU)+= emu.o -# L3 interconnect -obj-$(CONFIG_ARCH_OMAP3) += omap_l3_smx.o -obj-$(CONFIG_ARCH_OMAP4) += omap_l3_noc.o -obj-$(CONFIG_SOC_OMAP5)+= omap_l3_noc.o - obj-$(CONFIG_OMAP_MBOX_FWK)+= mailbox_mach.o mailbox_mach-objs := mailbox.o diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig index 6270415..bbec35d 100644 --- a/drivers/bus/Kconfig +++ b/drivers/bus/Kconfig @@ -12,4 +12,10 @@ config OMAP_OCP2SCP OCP2SCP and in OMAP5, both USB PHY and SATA PHY is connected via OCP2SCP. +config OMAP_INTERCONNECT + tristate "OMAP INTERCONNECT DRIVER" + depends on ARCH_OMAP2PLUS + + help + Driver to enable OMAP interconnect error handling driver. endmenu diff --git a/drivers/bus/Makefile b/drivers/bus/Makefile index 0ec50bc..45d997c 100644 --- a/drivers/bus/Makefile +++ b/drivers/bus/Makefile @@ -3,3 +3,6 @@ # obj-$(CONFIG_OMAP_OCP2SCP) += omap-ocp2scp.o + +# Interconnect bus driver for OMAP SoCs. +obj-$(CONFIG_OMAP_INTERCONNECT)+= omap_l3_smx.o omap_l3_noc.o diff --git a/arch/arm/mach-omap2/omap_l3_noc.c b/drivers/bus/omap_l3_noc.c similarity index 100% rename from arch/arm/mach-omap2/omap_l3_noc.c rename to drivers/bus/omap_l3_noc.c diff --git a/arch/arm/mach-omap2/omap_l3_noc.h b/drivers/bus/omap_l3_noc.h similarity index 100% rename from arch/arm/mach-omap2/omap_l3_noc.h rename to drivers/bus/omap_l3_noc.h diff --git a/arch/arm/mach-omap2/omap_l3_smx.c b/drivers/bus/omap_l3_smx.c similarity index 100% rename from arch/arm/mach-omap2/omap_l3_smx.c rename to drivers/bus/omap_l3_smx.c diff --git a/arch/arm/mach-omap2/omap_l3_smx.h b/drivers/bus/omap_l3_smx.h similarity index 100% rename from arch/arm/mach-omap2/omap_l3_smx.h rename to drivers/bus/omap_l3_smx.h -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: Push selects for TWD/SCU into machine entries
On Thursday 04 October 2012 02:20 PM, Stephen Boyd wrote: The TWD and SCU configs are selected by default as long as SCORPIONMP is false and/or MCT is false. Implementing the logic this way certainly saves lines in the Kconfig but it precludes those machines which select SCORPIONMP or MCT from participating in the single zImage effort because when those machines are combined with other SMP capable machines the TWD and SCU are no longer selected. Push the select out to the machine entries so that we can compile these machines together and still select the appropriate configs. Signed-off-by: Stephen Boyd Cc: David Brown Cc: Kukjin Kim Cc: Linus Walleij Cc: Pawel Moll Cc: Rob Herring Cc: Russell King Cc: Sascha Hauer Cc: Shiraz Hashim Cc: Simon Horman Cc: Srinidhi Kasagar Cc: Stephen Warren Cc: Tony Lindgren Cc: Viresh Kumar --- Does OMAP5 need to select TWD? I suspect not if it uses the architected timers. Nope. OMAP5 don't use TWD. Infact the external SCU is also used for A9 SOCs. You might want to check other A15 SOCS for SCU as well. [..] arch/arm/mach-omap2/Kconfig| 4 [..] diff --git a/arch/arm/mach-omap2/Kconfig b/arch/arm/mach-omap2/Kconfig index a6219ea..b618748 100644 --- a/arch/arm/mach-omap2/Kconfig +++ b/arch/arm/mach-omap2/Kconfig @@ -58,7 +58,9 @@ config ARCH_OMAP4 select CPU_V7 select ARM_GIC select HAVE_SMP + select HAVE_ARM_SCU if SMP select LOCAL_TIMERS if SMP + select HAVE_ARM_TWD if LOCAL_TIMERS select PL310_ERRATA_588369 select PL310_ERRATA_727915 select ARM_ERRATA_720789 Ok. @@ -75,6 +77,8 @@ config SOC_OMAP5 select CPU_V7 select ARM_GIC select HAVE_SMP + select HAVE_ARM_SCU if SMP + select HAVE_ARM_TWD if LOCAL_TIMERS select ARM_CPU_SUSPEND if PM select SOC_HAS_REALTIME_COUNTER select ARM_ARCH_TIMER Drop this change. With that fixed, for OMAP changes Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: Push selects for TWD/SCU into machine entries
On Friday 05 October 2012 06:42 AM, Simon Horman wrote: On Thu, Oct 04, 2012 at 02:41:46PM +0530, Santosh Shilimkar wrote: On Thursday 04 October 2012 02:20 PM, Stephen Boyd wrote: The TWD and SCU configs are selected by default as long as SCORPIONMP is false and/or MCT is false. Implementing the logic this way certainly saves lines in the Kconfig but it precludes those machines which select SCORPIONMP or MCT from participating in the single zImage effort because when those machines are combined with other SMP capable machines the TWD and SCU are no longer selected. Push the select out to the machine entries so that we can compile these machines together and still select the appropriate configs. Signed-off-by: Stephen Boyd Cc: David Brown Cc: Kukjin Kim Cc: Linus Walleij Cc: Pawel Moll Cc: Rob Herring Cc: Russell King Cc: Sascha Hauer Cc: Shiraz Hashim Cc: Simon Horman Cc: Srinidhi Kasagar Cc: Stephen Warren Cc: Tony Lindgren Cc: Viresh Kumar --- Does OMAP5 need to select TWD? I suspect not if it uses the architected timers. Nope. OMAP5 don't use TWD. Infact the external SCU is also used for A9 SOCs. You might want to check other A15 SOCS for SCU as well. In that case I am a bit confused by the following result: # git checkout v3.6 # ARCH=arm make omap2plus_defconfig # grep '(SOC_OMAP5|_SCU|_TWD|CONFIG_SMP|CONFIG_LOCAL_TIMERS)=' .config CONFIG_SOC_OMAP5=y CONFIG_SMP=y CONFIG_HAVE_ARM_SCU=y CONFIG_HAVE_ARM_TWD=y CONFIG_LOCAL_TIMERS=y Thats because omap2plus_defconfig build all the OMAP machines together including OMAP4 and OMAP5. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 2/6] sched: add a new SD SHARE_POWERLINE flag for sched_domain
Vincent, Few comments/questions. On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: This new flag SD SHARE_POWERLINE reflects the sharing of the power rail between the members of a domain. As this is the current assumption of the scheduler, the flag is added to all sched_domain Signed-off-by: Vincent Guittot --- arch/ia64/include/asm/topology.h |1 + arch/tile/include/asm/topology.h |1 + include/linux/sched.h|1 + include/linux/topology.h |3 +++ kernel/sched/core.c |5 + 5 files changed, 11 insertions(+) diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h index a2496e4..065c720 100644 --- a/arch/ia64/include/asm/topology.h +++ b/arch/ia64/include/asm/topology.h @@ -65,6 +65,7 @@ void build_cpu_to_node_map(void); | SD_BALANCE_EXEC \ | SD_BALANCE_FORK \ | SD_WAKE_AFFINE, \ + | arch_sd_share_power_line()\ .last_balance = jiffies, \ .balance_interval = 1,\ .nr_balance_failed = 0,\ diff --git a/arch/tile/include/asm/topology.h b/arch/tile/include/asm/topology.h index 7a7ce39..d39ed0b 100644 --- a/arch/tile/include/asm/topology.h +++ b/arch/tile/include/asm/topology.h @@ -72,6 +72,7 @@ static inline const struct cpumask *cpumask_of_node(int node) | 0*SD_PREFER_LOCAL \ | 0*SD_SHARE_CPUPOWER \ | 0*SD_SHARE_PKG_RESOURCES \ + | arch_sd_share_power_line()\ | 0*SD_SERIALIZE\ , \ .last_balance = jiffies, \ diff --git a/include/linux/sched.h b/include/linux/sched.h index 4786b20..74f2daf 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -862,6 +862,7 @@ enum cpu_idle_type { #define SD_WAKE_AFFINE0x0020 /* Wake task to waking CPU */ #define SD_PREFER_LOCAL 0x0040 /* Prefer to keep tasks local to this domain */ #define SD_SHARE_CPUPOWER 0x0080 /* Domain members share cpu power */ +#define SD_SHARE_POWERLINE 0x0100 /* Domain members share power domain */ If you ignore the current use of SD_SHARE_CPUPOWER, isn't the meaning of CPUPOWER and POWERLINE is same here. Just trying to understand the clear meaning of this new flag. Have you not considered SD_SHARE_CPUPOWER because it is being used for cpu_power and needs at least minimum two domains ? SD_PACKING would have been probably more appropriate based on the way it is being used in further series. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 3/6] sched: pack small tasks
Vincent, Few comments/questions. On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: During sched_domain creation, we define a pack buddy CPU if available. On a system that share the powerline at all level, the buddy is set to -1 On a dual clusters / dual cores system which can powergate each core and cluster independantly, the buddy configuration will be : | CPU0 | CPU1 | CPU2 | CPU3 | --- buddy | CPU0 | CPU0 | CPU0 | CPU2 | ^ Is that a typo ? Should it be CPU2 instead of CPU0 ? Small tasks tend to slip out of the periodic load balance. The best place to choose to migrate them is at their wake up. I have tried this series since I was looking at some of these packing bits. On Mobile workloads like OSIdle with Screen ON, MP3, gallary, I did see some additional filtering of threads with this series but its not making much difference in power. More on this below. Signed-off-by: Vincent Guittot --- kernel/sched/core.c |1 + kernel/sched/fair.c | 109 ++ kernel/sched/sched.h |1 + 3 files changed, 111 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index dab7908..70cadbe 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6131,6 +6131,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu) rcu_assign_pointer(rq->sd, sd); destroy_sched_domains(tmp, cpu); + update_packing_domain(cpu); update_top_cache_domain(cpu); } diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 4f4a4f6..8c9d3ed 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -157,6 +157,63 @@ void sched_init_granularity(void) update_sysctl(); } + +/* + * Save the id of the optimal CPU that should be used to pack small tasks + * The value -1 is used when no buddy has been found + */ +DEFINE_PER_CPU(int, sd_pack_buddy); + +/* Look for the best buddy CPU that can be used to pack small tasks + * We make the assumption that it doesn't wort to pack on CPU that share the s/wort/worth + * same powerline. We looks for the 1st sched_domain without the + * SD_SHARE_POWERLINE flag. Then We look for the sched_group witht the lowest + * power per core based on the assumption that their power efficiency is + * better */ Commenting style.. /* * */ Can you please expand the why the assumption is right ? "it doesn't wort to pack on CPU that share the same powerline" Think about a scenario where you have quad core, ducal cluster system |Cluster1| |cluster 2| | CPU0 | CPU1 | CPU2 | CPU3 | | CPU0 | CPU1 | CPU2 | CPU3 | Both clusters run from same voltage rail and have same PLL clocking them. But the cluster have their own power domain and all CPU's can power gate them-self to low power states. Clusters also have their own level2 caches. In this case, you will still save power if you try to pack load on one cluster. No ? +void update_packing_domain(int cpu) +{ + struct sched_domain *sd; + int id = -1; + + sd = highest_flag_domain(cpu, SD_SHARE_POWERLINE); + if (!sd) + sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); + else + sd = sd->parent; + + while (sd) { + struct sched_group *sg = sd->groups; + struct sched_group *pack = sg; + struct sched_group *tmp = sg->next; + + /* 1st CPU of the sched domain is a good candidate */ + if (id == -1) + id = cpumask_first(sched_domain_span(sd)); + + /* loop the sched groups to find the best one */ + while (tmp != sg) { + if (tmp->sgp->power * sg->group_weight < + sg->sgp->power * tmp->group_weight) + pack = tmp; + tmp = tmp->next; + } + + /* we have found a better group */ + if (pack != sg) + id = cpumask_first(sched_group_cpus(pack)); + + /* Look for another CPU than itself */ + if ((id != cpu) +|| ((sd->parent) && !(sd->parent->flags && SD_LOAD_BALANCE))) Is the condition "!(sd->parent->flags && SD_LOAD_BALANCE)" for big.LITTLE kind of system where SD_LOAD_BALANCE may not be used ? + break; + + sd = sd->parent; + } + + pr_info(KERN_INFO "CPU%d packing on CPU%d\n", cpu, id); + per_cpu(sd_pack_buddy, cpu) = id; +} + #if BITS_PER_LONG == 32 # define WMULT_CONST (~0UL) #else @@ -3073,6 +3130,55 @@ static int select_idle_sibling(struct task_struct *p, int target) return target; } +static inline bool is_buddy_busy(int cpu) +{ + struct rq *rq = cpu_rq(cpu); + + /* +* A busy buddy is a CPU with a high load or a small load with a lot of
Re: [RFC 4/6] sched: secure access to other CPU statistics
$subject is bit confusing here. On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: The atomic update of runnable_avg_sum and runnable_avg_period are ensured by their size and the toolchain. But we must ensure to not read an old value for one field and a newly updated value for the other field. As we don't want to lock other CPU while reading these fields, we read twice each fields and check that no change have occured in the middle. Signed-off-by: Vincent Guittot --- kernel/sched/fair.c | 19 +-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 8c9d3ed..6df53b5 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3133,13 +3133,28 @@ static int select_idle_sibling(struct task_struct *p, int target) static inline bool is_buddy_busy(int cpu) { struct rq *rq = cpu_rq(cpu); + volatile u32 *psum = &rq->avg.runnable_avg_sum; + volatile u32 *pperiod = &rq->avg.runnable_avg_period; + u32 sum, new_sum, period, new_period; + int timeout = 10; So it can be 2 times read or more as well. + + while (timeout) { + sum = *psum; + period = *pperiod; + new_sum = *psum; + new_period = *pperiod; + + if ((sum == new_sum) && (period == new_period)) + break; + + timeout--; + } Seems like you did notice incorrect pair getting read for rq runnable_avg_sum and runnable_avg_period. Seems like the fix is to update them together under some lock to avoid such issues. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 5/6] sched: pack the idle load balance
On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: Look for an idle CPU close the pack buddy CPU whenever possible. s/close/close to The goal is to prevent the wake up of a CPU which doesn't share the power line of the pack CPU Signed-off-by: Vincent Guittot --- kernel/sched/fair.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 6df53b5..f76acdc 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5158,7 +5158,25 @@ static struct { static inline int find_new_ilb(int call_cpu) { + struct sched_domain *sd; int ilb = cpumask_first(nohz.idle_cpus_mask); + int buddy = per_cpu(sd_pack_buddy, call_cpu); + + /* +* If we have a pack buddy CPU, we try to run load balance on a CPU +* that is close to the buddy. +*/ + if (buddy != -1) + for_each_domain(buddy, sd) { + if (sd->flags & SD_SHARE_CPUPOWER) + continue; Do you mean SD_SHARE_POWERLINE here ? + + ilb = cpumask_first_and(sched_domain_span(sd), + nohz.idle_cpus_mask); + + if (ilb < nr_cpu_ids) + break; + } if (ilb < nr_cpu_ids && idle_cpu(ilb)) return ilb; Can you please expand "idle CPU _close_ the pack buddy CPU" ? Regards santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 6/6] ARM: sched: clear SD_SHARE_POWERLINE
On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: The ARM platforms take advantage of packing small tasks on few cores. This is true even when the cores of a cluster can't be powergated independently. Signed-off-by: Vincent Guittot --- arch/arm/kernel/topology.c |5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c index 26c12c6..00511d0 100644 --- a/arch/arm/kernel/topology.c +++ b/arch/arm/kernel/topology.c @@ -226,6 +226,11 @@ static inline void update_cpu_power(unsigned int cpuid, unsigned int mpidr) {} */ struct cputopo_arm cpu_topology[NR_CPUS]; +int arch_sd_share_power_line(void) +{ + return 0*SD_SHARE_POWERLINE; +} Making this selection of policy based on sched domain will better. Just gives the flexibility to choose a separate scheme for big and little systems which will be very convenient. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH RFT] ARM64: dma-mapping: support debug_dma_mapping_error
On Friday 26 October 2012 04:17 AM, Shuah Khan wrote: Add support for debug_dma_mapping_error() call to avoid warning from debug_dma_unmap() interface when it checks for mapping error checked status. Without this patch, device driver failed to check map error warning is generated. Signed-off-by: Shuah Khan --- Looks good. Similar fix for 32bit ARM port is done in below commit. From 871ae57adc5ed092c1341f411514d0e8482e2611 Mon Sep 17 00:00:00 2001 From: Ming Lei Date: Mon, 22 Oct 2012 20:44:03 +0800 Subject: [PATCH] ARM: dma-mapping: support debug_dma_mapping_error Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check
On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote: Replaces several flags bearing the same meaning. There is no need to set flags due to different omap types here, it can be checked in appropriate places as well. Cc: Tony Lindgren Cc: Russell King Cc: linux-o...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ivan Khoronzhuk --- arch/arm/mach-omap2/id.c | 25 +++-- arch/arm/mach-omap2/soc.h |8 ++-- 2 files changed, 9 insertions(+), 24 deletions(-) diff --git a/arch/arm/mach-omap2/id.c b/arch/arm/mach-omap2/id.c index cf2362c..3c47a19 100644 --- a/arch/arm/mach-omap2/id.c +++ b/arch/arm/mach-omap2/id.c @@ -28,6 +28,9 @@ #include "soc.h" #include "control.h" +#define OMAP4_SILICON_TYPE_STANDARD0x01 +#define OMAP4_SILICON_TYPE_PERFORMANCE 0x02 + static unsigned int omap_revision; static const char *cpu_rev; u32 omap_features; @@ -273,25 +276,11 @@ void __init omap4xxx_check_features(void) { u32 si_type; - if (cpu_is_omap443x()) - omap_features |= OMAP4_HAS_MPU_1GHZ; - + si_type = + (read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1) >> 16) & 0x03; - if (cpu_is_omap446x()) { - si_type = - read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1); - switch ((si_type & (3 << 16)) >> 16) { - case 2: - /* High performance device */ - omap_features |= OMAP4_HAS_MPU_1_5GHZ; - break; - case 1: - default: - /* Standard device */ - omap_features |= OMAP4_HAS_MPU_1_2GHZ; - break; - } - } + if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE) + omap_features = OMAP4_HAS_PERF_SILICON; Well the detection isn't for performance/standard but there are some features depend on it. For example 1 GHz doesn't DPLL DCC enable feature where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this information is also effused. Have you considered this aspect while creating this patch ? Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check
On Thursday 01 November 2012 09:50 PM, ivan.khoronzhuk wrote: On 11/01/2012 01:35 PM, Santosh Shilimkar wrote: On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote: Replaces several flags bearing the same meaning. There is no need to set flags due to different omap types here, it can be checked in appropriate places as well. Cc: Tony Lindgren Cc: Russell King Cc: linux-o...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ivan Khoronzhuk --- arch/arm/mach-omap2/id.c | 25 +++-- arch/arm/mach-omap2/soc.h |8 ++-- 2 files changed, 9 insertions(+), 24 deletions(-) diff --git a/arch/arm/mach-omap2/id.c b/arch/arm/mach-omap2/id.c index cf2362c..3c47a19 100644 --- a/arch/arm/mach-omap2/id.c +++ b/arch/arm/mach-omap2/id.c @@ -28,6 +28,9 @@ #include "soc.h" #include "control.h" +#define OMAP4_SILICON_TYPE_STANDARD0x01 +#define OMAP4_SILICON_TYPE_PERFORMANCE0x02 + static unsigned int omap_revision; static const char *cpu_rev; u32 omap_features; @@ -273,25 +276,11 @@ void __init omap4xxx_check_features(void) { u32 si_type; -if (cpu_is_omap443x()) -omap_features |= OMAP4_HAS_MPU_1GHZ; - +si_type = +(read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1) >> 16) & 0x03; -if (cpu_is_omap446x()) { -si_type = - read_tap_reg(OMAP4_CTRL_MODULE_CORE_STD_FUSE_PROD_ID_1); -switch ((si_type & (3 << 16)) >> 16) { -case 2: -/* High performance device */ -omap_features |= OMAP4_HAS_MPU_1_5GHZ; -break; -case 1: -default: -/* Standard device */ -omap_features |= OMAP4_HAS_MPU_1_2GHZ; -break; -} -} +if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE) +omap_features = OMAP4_HAS_PERF_SILICON; Well the detection isn't for performance/standard but there are some features depend on it. For example 1 GHz doesn't DPLL DCC enable feature where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this information is also effused. Have you considered this aspect while creating this patch ? Regards Santosh I had considered it. There is no dependency on the features. DCC usage depends on asked frequency on the fly, not from the features. Depending on "performance/standard" feature the available frequencies should be chosen in places where they are needed, for example while initializing OPPs. You are correct about the way DCC is handled in the clock code. Infact all these features like L2CACHE, SGX, IVA etc is more for to display boot messages. Currently we have several features while it is only one indeed. 1GHz, 1.2GHz, 1.5 GHz are not same since the silicon capability itself is different. git blame tells me that Nishant has sent this update so looping him if above differentiation in boot log helps. Nishant, What's your take on removing above freq prints and marking those silicon as performance silicons as the $subject patch does ? Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH] ARM: OMAP4: ID: Improve features detection and check
On Thursday 01 November 2012 10:36 PM, Nishanth Menon wrote: On 22:05-20121101, Santosh Shilimkar wrote: On Thursday 01 November 2012 09:50 PM, ivan.khoronzhuk wrote: On 11/01/2012 01:35 PM, Santosh Shilimkar wrote: On Thursday 01 November 2012 03:53 PM, Ivan Khoronzhuk wrote: Replaces several flags bearing the same meaning. There is no need to set flags due to different omap types here, it can be checked in appropriate places as well. Cc: Tony Lindgren Cc: Russell King Cc: linux-o...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ivan Khoronzhuk --- [..] +if (si_type == OMAP4_SILICON_TYPE_PERFORMANCE) +omap_features = OMAP4_HAS_PERF_SILICON; Well the detection isn't for performance/standard but there are some features depend on it. For example 1 GHz doesn't DPLL DCC enable feature where as 1.2 GHz, 1.5 GHz doesn't need. This is the main reason this information is also effused. Have you considered this aspect while creating this patch ? Regards Santosh I had considered it. There is no dependency on the features. DCC usage depends on asked frequency on the fly, not from the features. Depending on "performance/standard" feature the available frequencies should be chosen in places where they are needed, for example while initializing OPPs. You are correct about the way DCC is handled in the clock code. Infact all these features like L2CACHE, SGX, IVA etc is more for to display boot messages. Currently we have several features while it is only one indeed. 1GHz, 1.2GHz, 1.5 GHz are not same since the silicon capability itself is different. git blame tells me that Nishant has sent this update so looping him if above differentiation in boot log helps. Nishant, What's your take on removing above freq prints and marking those silicon as performance silicons as the $subject patch does ? Regards Santosh Yes $subject patch is a better approach compared to having freq based handling which just increases the number of macros we need to enable depending on SoC variants that we spin off the main SoC. This also allows us to conserve the features bitfield ahead as well. I hate to admit, but after a couple of generations of SoC spinoffs, my original logic is proving to be was pretty short sighted, unfortunately :( So, approach Acked-by: Nishanth Menon Thanks Nishant for clarification and ack. With the clarification I also like the subject patch. Feel free add. Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 2/6] sched: add a new SD SHARE_POWERLINE flag for sched_domain
On Monday 29 October 2012 03:20 PM, Vincent Guittot wrote: It looks like i need to describe more what On 29 October 2012 10:40, Vincent Guittot wrote: On 24 October 2012 17:17, Santosh Shilimkar wrote: Vincent, Few comments/questions. On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: This new flag SD SHARE_POWERLINE reflects the sharing of the power rail between the members of a domain. As this is the current assumption of the scheduler, the flag is added to all sched_domain Signed-off-by: Vincent Guittot --- arch/ia64/include/asm/topology.h |1 + arch/tile/include/asm/topology.h |1 + include/linux/sched.h|1 + include/linux/topology.h |3 +++ kernel/sched/core.c |5 + 5 files changed, 11 insertions(+) diff --git a/arch/ia64/include/asm/topology.h b/arch/ia64/include/asm/topology.h index a2496e4..065c720 100644 --- a/arch/ia64/include/asm/topology.h +++ b/arch/ia64/include/asm/topology.h @@ -65,6 +65,7 @@ void build_cpu_to_node_map(void); | SD_BALANCE_EXEC \ | SD_BALANCE_FORK \ | SD_WAKE_AFFINE, \ + | arch_sd_share_power_line()\ .last_balance = jiffies, \ .balance_interval = 1,\ .nr_balance_failed = 0,\ diff --git a/arch/tile/include/asm/topology.h b/arch/tile/include/asm/topology.h index 7a7ce39..d39ed0b 100644 --- a/arch/tile/include/asm/topology.h +++ b/arch/tile/include/asm/topology.h @@ -72,6 +72,7 @@ static inline const struct cpumask *cpumask_of_node(int node) | 0*SD_PREFER_LOCAL \ | 0*SD_SHARE_CPUPOWER \ | 0*SD_SHARE_PKG_RESOURCES \ + | arch_sd_share_power_line()\ | 0*SD_SERIALIZE\ , \ .last_balance = jiffies, \ diff --git a/include/linux/sched.h b/include/linux/sched.h index 4786b20..74f2daf 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -862,6 +862,7 @@ enum cpu_idle_type { #define SD_WAKE_AFFINE0x0020 /* Wake task to waking CPU */ #define SD_PREFER_LOCAL 0x0040 /* Prefer to keep tasks local to this domain */ #define SD_SHARE_CPUPOWER 0x0080 /* Domain members share cpu power */ +#define SD_SHARE_POWERLINE 0x0100 /* Domain members share power domain */ If you ignore the current use of SD_SHARE_CPUPOWER, isn't the meaning of CPUPOWER and POWERLINE is same here. Just trying to understand the clear meaning of this new flag. Have you not considered SD_SHARE_CPUPOWER because it is being used for cpu_power and needs at least minimum two domains ? SD_PACKING would have been probably more appropriate based on the way it is being used in further series. CPUPOWER reflects the share of hw ressources between cores like for hyper threading. POWERLINE describes the fact that cores are sharing the same power line amore precisely the powergate. Sorry, the mail has been sent too early while I was writing it CPUPOWER reflects the share of hw ressources between cores like for hyper threading. POWERLINE describes the fact that cores are sharing the same power line and more precisely the same power gating. It looks like I need to describe more precisely what i would mean with SHARE_POWERLINE. Yes. More description will help. I see bit of overlap POWERLINE flag with SD_SHARE_CPUPOWER and SD_SHARE_PKG_RESOURCES and hence the questions. I don't want to use PACKING because it's more a behavior than a feature. If cores can power gate independently (!SD_SHARE_POWERLINE), packing small tasks is one interesting behavior but it may be not the only one. I want to make a difference between the HW configuration and the behavior we want to have above it Fair enough. Thanks for clarification. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 3/6] sched: pack small tasks
On Monday 29 October 2012 06:42 PM, Vincent Guittot wrote: On 24 October 2012 17:20, Santosh Shilimkar wrote: Vincent, Few comments/questions. On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: During sched_domain creation, we define a pack buddy CPU if available. On a system that share the powerline at all level, the buddy is set to -1 On a dual clusters / dual cores system which can powergate each core and cluster independantly, the buddy configuration will be : | CPU0 | CPU1 | CPU2 | CPU3 | --- buddy | CPU0 | CPU0 | CPU0 | CPU2 | ^ Is that a typo ? Should it be CPU2 instead of CPU0 ? No it's not a typo. The system packs at each scheduling level. It starts to pack in cluster because each core can power gate independently so CPU1 tries to pack its tasks in CPU0 and CPU3 in CPU2. Then, it packs at CPU level so CPU2 tries to pack in the cluster of CPU0 and CPU0 packs in itself I get it. Though in above example a task may migrate from say CPU3->CPU2->CPU0 as part of packing. I was just thinking whether moving such task from say CPU3 to CPU0 might be best instead. Small tasks tend to slip out of the periodic load balance. The best place to choose to migrate them is at their wake up. I have tried this series since I was looking at some of these packing bits. On Mobile workloads like OSIdle with Screen ON, MP3, gallary, I did see some additional filtering of threads with this series but its not making much difference in power. More on this below. Can I ask you which configuration you have used ? how many cores and cluster ? Can they be power gated independently ? I have been trying with couple of setups. Dual Core ARM machine and Quad core X86 box with single package thought most of the mobile workload analysis I was doing on ARM machine. On both setups CPUs can be gated independently. Signed-off-by: Vincent Guittot --- kernel/sched/core.c |1 + kernel/sched/fair.c | 109 ++ kernel/sched/sched.h |1 + 3 files changed, 111 insertions(+) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index dab7908..70cadbe 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6131,6 +6131,7 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu) rcu_assign_pointer(rq->sd, sd); destroy_sched_domains(tmp, cpu); + update_packing_domain(cpu); update_top_cache_domain(cpu); } diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 4f4a4f6..8c9d3ed 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -157,6 +157,63 @@ void sched_init_granularity(void) update_sysctl(); } + +/* + * Save the id of the optimal CPU that should be used to pack small tasks + * The value -1 is used when no buddy has been found + */ +DEFINE_PER_CPU(int, sd_pack_buddy); + +/* Look for the best buddy CPU that can be used to pack small tasks + * We make the assumption that it doesn't wort to pack on CPU that share the s/wort/worth yes + * same powerline. We looks for the 1st sched_domain without the + * SD_SHARE_POWERLINE flag. Then We look for the sched_group witht the lowest + * power per core based on the assumption that their power efficiency is + * better */ Commenting style.. /* * */ yes Can you please expand the why the assumption is right ? "it doesn't wort to pack on CPU that share the same powerline" By "share the same power-line", I mean that the CPUs can't power off independently. So if some CPUs can't power off independently, it's worth to try to use most of them to race to idle. In that case I suggest we use different word here. Power line can be treated as voltage line, power domain. May be SD_SHARE_CPU_POWERDOMAIN ? Think about a scenario where you have quad core, ducal cluster system |Cluster1| |cluster 2| | CPU0 | CPU1 | CPU2 | CPU3 | | CPU0 | CPU1 | CPU2 | CPU3 | Both clusters run from same voltage rail and have same PLL clocking them. But the cluster have their own power domain and all CPU's can power gate them-self to low power states. Clusters also have their own level2 caches. In this case, you will still save power if you try to pack load on one cluster. No ? yes, I need to update the description of SD_SHARE_POWERLINE because I'm afraid I was not clear enough. SD_SHARE_POWERLINE includes the power gating capacity of each core. For your example above, the SD_SHARE_POWERLINE shoud be cleared at both MC and CPU level. Thanks for clarification. +void update_packing_domain(int cpu) +{ + struct sched_domain *sd; + int id = -1; + + sd = highest_flag_domain(cpu, SD_SHARE_POWERLINE); + if (!sd) + sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd); + else + sd = sd->
Re: [RFC 5/6] sched: pack the idle load balance
On Monday 29 October 2012 06:57 PM, Vincent Guittot wrote: On 24 October 2012 17:21, Santosh Shilimkar wrote: On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: Look for an idle CPU close the pack buddy CPU whenever possible. s/close/close to yes The goal is to prevent the wake up of a CPU which doesn't share the power line of the pack CPU Signed-off-by: Vincent Guittot --- kernel/sched/fair.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 6df53b5..f76acdc 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5158,7 +5158,25 @@ static struct { static inline int find_new_ilb(int call_cpu) { + struct sched_domain *sd; int ilb = cpumask_first(nohz.idle_cpus_mask); + int buddy = per_cpu(sd_pack_buddy, call_cpu); + + /* +* If we have a pack buddy CPU, we try to run load balance on a CPU +* that is close to the buddy. +*/ + if (buddy != -1) + for_each_domain(buddy, sd) { + if (sd->flags & SD_SHARE_CPUPOWER) + continue; Do you mean SD_SHARE_POWERLINE here ? No, I just don't want to take hyperthread level for ILB + + ilb = cpumask_first_and(sched_domain_span(sd), + nohz.idle_cpus_mask); + + if (ilb < nr_cpu_ids) + break; + } if (ilb < nr_cpu_ids && idle_cpu(ilb)) return ilb; Can you please expand "idle CPU _close_ the pack buddy CPU" ? The goal is to packed the tasks on the pack buddy CPU so when the scheduler needs to start ILB, I try to wake up a CPU that is close to the buddy and preferably in the same cluster I see your point now. Thanks for clarification. Regards santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 6/6] ARM: sched: clear SD_SHARE_POWERLINE
On Monday 29 October 2012 06:58 PM, Vincent Guittot wrote: On 24 October 2012 17:21, Santosh Shilimkar wrote: On Sunday 07 October 2012 01:13 PM, Vincent Guittot wrote: The ARM platforms take advantage of packing small tasks on few cores. This is true even when the cores of a cluster can't be powergated independently. Signed-off-by: Vincent Guittot --- arch/arm/kernel/topology.c |5 + 1 file changed, 5 insertions(+) diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c index 26c12c6..00511d0 100644 --- a/arch/arm/kernel/topology.c +++ b/arch/arm/kernel/topology.c @@ -226,6 +226,11 @@ static inline void update_cpu_power(unsigned int cpuid, unsigned int mpidr) {} */ struct cputopo_arm cpu_topology[NR_CPUS]; +int arch_sd_share_power_line(void) +{ + return 0*SD_SHARE_POWERLINE; +} Making this selection of policy based on sched domain will better. Just gives the flexibility to choose a separate scheme for big and little systems which will be very convenient. I agree that it would be more flexible to be able to set it for each level Will you be addressing that in next version then ? Regards santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ARM: OMAP4: Basic default configuration for Pandaboard
On Saturday 27 October 2012 05:33 PM, Constantine Shulyupin wrote: From: Constantine Shulyupin Tested SD (MMC) and Ethernet smsc95xx on linux-3.7-rc2 Signed-off-by: Constantine Shulyupin -- kernel size is 2.3 MiB --- arch/arm/configs/omap4_panda_defconfig | 115 We got rid of all omap board defconfigs and now just using omap2plus_defconfig for all OMAP2/3/4/5 machines. If something is missing from omap2plus_defconfig, which you need to on your panda custom config, do send patch for that. Otherwise, we are no longer adding board defconfigs. Regards Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3] gpio: interrupt consistency check for OF GPIO IRQs
On Tuesday 27 August 2013 04:17 PM, Stephen Warren wrote: > On 08/26/2013 08:07 AM, Lars Poeschel wrote: >> From: Linus Walleij >> >> Currently the kernel is ambigously treating GPIOs and interrupts >> from a GPIO controller: GPIOs and interrupts are treated as >> orthogonal. This unfortunately makes it unclear how to actually >> retrieve and request a GPIO line or interrupt from a GPIO >> controller in the device tree probe path. > > I still think that this patch is the wrong approach. Instead, the logic > should be hooked into gpio_request() and request_irq(). This patch only > addresses DT, and ignores anything else, hence doesn't seem like the > right level of abstraction to plug in, since the issue is not related to DT. > On the issue, Rajendra cooked up a patch [1] to handle the issue at irqdomain level which was shot down after further discussion. It might be useful for the discussion here. Ofcourse it was also targeted to solve only DT issue. Regards, Santosh [1] "[PATCH] irqdomain: Do a gpio_request in case of a gpio irq_chip" http://www.mail-archive.com/linux-omap@vger.kernel.org/msg83451.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP
On Tuesday 01 October 2013 09:48 AM, Rob Herring wrote: > On 10/01/2013 06:13 AM, Sricharan R wrote: >> Hi, >> >> On Monday 30 September 2013 08:39 PM, Rob Herring wrote: >>> On 09/30/2013 08:59 AM, Sricharan R wrote: Some socs have a large number of interrupts requests to service the needs of its many peripherals and subsystems. All of the interrupt requests lines from the subsystems are not needed at the same time, so they have to be muxed to the controllers appropriately. In such places a interrupt controllers are preceded by an IRQ CROSSBAR that provides flexibility in muxing the device interrupt requests to the controller inputs. This series models the peripheral interrupts that can be routed through the crossbar to the GIC as 'routable-irqs'. The routable irqs are added in a separate linear domain inside the GIC. The registered routable domain's callback are invoked as a part of the GIC's callback, which in turn should allocate a free irq line and configure the IP accordingly. So every peripheral in the dts files mentions the fixed crossbar number as its interrupt. A free gic line for that gets allocated and configured when the peripheral's interrupt is mapped. The minimal crossbar driver to track and allocate free GIC lines and configure the crossbar is added here, along with the DT bindings. >>> Seems like interrupt-map property is what you need here. >>> >>> http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping >>> >>> Versatile Express also has an example. >>OK, but the idea was not to tie up the crossbar<->interrupt numbers at the >>DTS level, but to assign it dynamically during runtime. This was one of >> the >> comments that came up with first crossbar support patches, which was >> assigning a >> interrupt line to crossbar number in the DTS and setting it up in crossbar >> probe. > > Is there an actual usecase on a single h/w design that you run out of > interrupts and it is a user decision which interrupts to use? > Yes. There are 240 peripheral interrupts connected out of which 160 can be used in this specific case. > You could fill in the interrupt-map at run-time. It would have to be > early (bootloader or early kernel init) and can't be at request_irq time. > Well all options are tried before coming up to the $subject solution. It was suggested by Thomas in the last review. >> https://lkml.org/lkml/2013/7/18/416 >> >>Since this approach of assigning in DTS was opposed, we moved to IRQCHIP >> and >>that did not go as well. Finally was asked to handle this as a part of >> GIC driver with >>a separate domain. >> >> http://www.spinics.net/lists/linux-omap/msg97085.html > > This has nothing to do with the GIC, so it does not belong there. > Well the router makes connections from peripheral to GIC. Thomas can better explain it but I think since its doing irq routing for GIC on a given hardware, I don't see any issue having some generic map/unmap function in GIC. The actual implementation is still outside of GIC. Regards, Sasntosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP
On Tuesday 01 October 2013 10:53 AM, Rob Herring wrote: > On 10/01/2013 08:57 AM, Santosh Shilimkar wrote: >> On Tuesday 01 October 2013 09:48 AM, Rob Herring wrote: >>> On 10/01/2013 06:13 AM, Sricharan R wrote: >>>> Hi, >>>> >>>> On Monday 30 September 2013 08:39 PM, Rob Herring wrote: >>>>> On 09/30/2013 08:59 AM, Sricharan R wrote: >>>>>> Some socs have a large number of interrupts requests to service >>>>>> the needs of its many peripherals and subsystems. All of the interrupt >>>>>> requests lines from the subsystems are not needed at the same >>>>>> time, so they have to be muxed to the controllers appropriately. >>>>>> In such places a interrupt controllers are preceded by an >>>>>> IRQ CROSSBAR that provides flexibility in muxing the device interrupt >>>>>> requests to the controller inputs. >>>>>> >>>>>> This series models the peripheral interrupts that can be routed through >>>>>> the crossbar to the GIC as 'routable-irqs'. The routable irqs are added >>>>>> in a separate linear domain inside the GIC. The registered routable >>>>>> domain's >>>>>> callback are invoked as a part of the GIC's callback, which in turn >>>>>> should >>>>>> allocate a free irq line and configure the IP accordingly. So every >>>>>> peripheral >>>>>> in the dts files mentions the fixed crossbar number as its interrupt. A >>>>>> free >>>>>> gic line for that gets allocated and configured when the peripheral's >>>>>> interrupt >>>>>> is mapped. >>>>>> >>>>>> The minimal crossbar driver to track and allocate free GIC lines and >>>>>> configure the >>>>>> crossbar is added here, along with the DT bindings. >>>>> Seems like interrupt-map property is what you need here. >>>>> >>>>> http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping >>>>> >>>>> Versatile Express also has an example. >>>>OK, but the idea was not to tie up the crossbar<->interrupt numbers at >>>> the >>>>DTS level, but to assign it dynamically during runtime. This was one of >>>> the >>>> comments that came up with first crossbar support patches, which was >>>> assigning a >>>> interrupt line to crossbar number in the DTS and setting it up in >>>> crossbar probe. >>> >>> Is there an actual usecase on a single h/w design that you run out of >>> interrupts and it is a user decision which interrupts to use? >>> >> Yes. There are 240 peripheral interrupts connected out of which 160 can >> be used in this specific case. > > Yes, I understand the SOC connections. That does not answer my question. > The 240 interrupts are likely to be limited to fewer by board design, > pinmuxing, etc. > yes limited by different board designs ... > How do you handle the 161st interrupt request? Will never happen? Just > rely on the random driver probe ordering? > Well the board dts are expected to provide the peripheral support info to optimise it. If a board actually has more than 160 peripherals available then in that case the 161 interrupt will not be mapped. >>> You could fill in the interrupt-map at run-time. It would have to be >>> early (bootloader or early kernel init) and can't be at request_irq time. >>> >> Well all options are tried before coming up to the $subject solution. >> It was suggested by Thomas in the last review. >> >>>> https://lkml.org/lkml/2013/7/18/416 >>>> >>>>Since this approach of assigning in DTS was opposed, we moved to >>>> IRQCHIP and >>>>that did not go as well. Finally was asked to handle this as a part of >>>> GIC driver with >>>>a separate domain. >>>> >>>> http://www.spinics.net/lists/linux-omap/msg97085.html >>> >>> This has nothing to do with the GIC, so it does not belong there. >>> >> Well the router makes connections from peripheral to GIC. Thomas can >> better explain it but I think since its doing irq routing for GIC on >> a given hardware, I don't see any issue having some generic map/unmap >> function in GIC. The actual implementation is still outside of GIC. > > I rea
[PATCH] sched_clock: fix postinit no sched_clock function check
The sched_clock code uses 2 levels of function pointers, sched_clock_func() and read_sched_clock() but the no sched_clock check in postinit() just checks read_sched_clock(). This leads to kernel falling back to jiffy based sched clock even in presence of sched_clock_func() which is not desirable. Fix the postinit() check to avoid the issue. Probably the issue is hidden so far on most of the arm SOCs because of already existing sched_clock registrations apart from arch_timer sched_clock. One can reproduce the issue by just have arch_timer as sched_clock Cc: Stephen Boyd Cc: John Stultz Cc: Russell King Cc: Will Deacon Cc: Rob Herring Cc: Thomas Gleixner Signed-off-by: Santosh Shilimkar --- kernel/time/sched_clock.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c index 0b479a6..15c4d78 100644 --- a/kernel/time/sched_clock.c +++ b/kernel/time/sched_clock.c @@ -179,7 +179,8 @@ void __init sched_clock_postinit(void) * If no sched_clock function has been provided at that point, * make it the final one one. */ - if (read_sched_clock == jiffy_sched_clock_read) + if ((read_sched_clock == jiffy_sched_clock_read) && + (sched_clock_func == sched_clock_32)) setup_sched_clock(jiffy_sched_clock_read, 32, HZ); sched_clock_poll(sched_clock_timer.data); -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched_clock: fix postinit no sched_clock function check
On Wednesday 02 October 2013 01:09 PM, Will Deacon wrote: > On Wed, Oct 02, 2013 at 05:55:28PM +0100, Santosh Shilimkar wrote: >> The sched_clock code uses 2 levels of function pointers, sched_clock_func() >> and read_sched_clock() but the no sched_clock check in postinit() just >> checks read_sched_clock(). >> >> This leads to kernel falling back to jiffy based sched clock even in >> presence of sched_clock_func() which is not desirable. >> >> Fix the postinit() check to avoid the issue. Probably the issue is hidden >> so far on most of the arm SOCs because of already existing sched_clock >> registrations apart from arch_timer sched_clock. One can reproduce the >> issue by just have arch_timer as sched_clock > > Isn't this just an issue with the arch timer driver not calling > setup_sched_clock? Instead, we munge around with sched_clock_func directly, > which doesn't appear to be the way anybody else deals with this. > I thought about that option as well but was not sure since even in that case the check is not complete. We just ensure that function is popullated. > I'm not sure of the history though, so perhaps there's a reason for this... > Am curious as well. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched_clock: fix postinit no sched_clock function check
On Wednesday 02 October 2013 01:22 PM, Stephen Boyd wrote: > On 10/02/13 10:14, Santosh Shilimkar wrote: >> On Wednesday 02 October 2013 01:09 PM, Will Deacon wrote: >>> On Wed, Oct 02, 2013 at 05:55:28PM +0100, Santosh Shilimkar wrote: >>>> The sched_clock code uses 2 levels of function pointers, sched_clock_func() >>>> and read_sched_clock() but the no sched_clock check in postinit() just >>>> checks read_sched_clock(). >>>> >>>> This leads to kernel falling back to jiffy based sched clock even in >>>> presence of sched_clock_func() which is not desirable. >>>> >>>> Fix the postinit() check to avoid the issue. Probably the issue is hidden >>>> so far on most of the arm SOCs because of already existing sched_clock >>>> registrations apart from arch_timer sched_clock. One can reproduce the >>>> issue by just have arch_timer as sched_clock >>> Isn't this just an issue with the arch timer driver not calling >>> setup_sched_clock? Instead, we munge around with sched_clock_func directly, >>> which doesn't appear to be the way anybody else deals with this. >>> >> I thought about that option as well but was not sure since even in that case >> the check is not complete. We just ensure that function is popullated. > > Yes, nothing is actually broken because sched_clock_func() won't try to > use the jiffy based read_sched_clock() function. I'm not sure we > actually need this patch besides to remove a useless timer that updates > the jiffy epoch. Can we wait until my 64-bit sched_clock patch series > lands in 3.13? It looks like I still need an ack from Will or Catalin on > the architected timer patch before the clocksource folks pick it up. > Really... I have not created patch out of fun. Its broken on my keystone machine at least where the sched_clock is falling back on jiffy based sched_clock even in presence of arch_timer sched_clock. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched_clock: fix postinit no sched_clock function check
On Wednesday 02 October 2013 01:48 PM, Will Deacon wrote: > On Wed, Oct 02, 2013 at 06:42:40PM +0100, Stephen Boyd wrote: >> On 10/02/13 10:27, Santosh Shilimkar wrote: >>> Really... I have not created patch out of fun. >>> Its broken on my keystone machine at least where the sched_clock is >>> falling back on jiffy based sched_clock even in presence of arch_timer >>> sched_clock. >> >> How is that possible? sched_clock_func is only assigned by >> arch/arm/kernel/arch_timer.c when the architected timer is detected and >> sched_clock() in kernel/time/sched_clock.c calls that function pointer >> unconditionally. The only way I see this happening is if the architected >> timer rate is zero. > ^^ > > *cough* CNTFRQ *cough* > :) CNTFRQ as such is fine. I think the below print mis-lead me mostly. sched_clock: ARM arch timer >56 bits at 6144kHz, resolution 162ns sched_clock: 32 bits at 100 Hz, resolution 1000ns, wraps every 4294967286ms So yes, now the subject patch actually just avoids the jiffy sched_clock() registration and nothing else. Even without the patch arch_timer sched_clock will be in use. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Use of drivers/platform and matching include?
On Friday 04 October 2013 12:48 PM, Olof Johansson wrote: > On Fri, Oct 4, 2013 at 6:22 AM, Greg Kroah-Hartman > wrote: >> On Fri, Oct 04, 2013 at 12:41:28PM +0100, Russell King - ARM Linux wrote: >>> >>> So, no, there will be no new drivers under arch/arm. They must be in the >>> drivers subtree somewhere. >> >> I have no objection with this, and encourage it. > > Ok, so these are some of the requirements as far as I see it: > > * No per-vendor driver dumping ground under drivers/* (i.e. no > drivers/platform//) > * No weirdly constructed single-driver directories directly under > drivers/* (we already have a few and should look at moving those) > because nothing else fits > * We need some sort of convention on dependencies. Several of these > are more libraries than drivers, i.e. we'll have cross-calls for > things like queue management, resource allocation, etc. So having a > single location to hold most of these makes sense instead of > everything cross-depending on everything else. > > Based on the above, how about we create something like > drivers/resourcemgr to hold these? I think at least parts of the > mvebu-mbus driver that ended up under drivers/bus might be a fit to > move there. The APM queue allocator would likely be a fit, and maybe > some of the qualcomm stuff. Kumar, what are your thoughts on that? Slightly different question but relevant to th thread w.r.t the Queue allocator/manager. We are also interested for TI Keystone SOCs. Currently we have generic drivers/hwqueue/ with core hwqueue layer implementing the standard hardware queue descriptor push/pop and notification mechanisms and then Keystone specific driver using those core functionalities. I read most of the networking SOCs has some sort of hwqueues but not sure about the its implementations. So just thought of bringing it into the thread discussion to see if hwqueue core layer is of any interest to other SOCs. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/2] ARM: OMAP: Add secure function omap_smc3() which calling instruction smc #1
On Thursday 11 July 2013 03:43 PM, Ивайло Димитров wrote: > > > > > > > Оригинално писмо > >От: Dave Martin > >Относно: Re: [PATCH v2 1/2] ARM: OMAP: Add secure function omap_smc3() which > calling instruction smc #1 > >До: Pali Rohár > >Изпратено на: Сряда, 2013, Юли 10 20:45:26 EEST > > > > > >On Wed, Jul 10, 2013 at 02:59:04PM +0200, Pali Rohár wrote: > >> Other secure functions omap_smc1() and omap_smc2() calling instruction > smc #0 > >> but Nokia RX-51 board needs to call smc #1 for PPA access. > >> > >> Signed-off-by: Ivaylo Dimitrov > >> Signed-off-by: Pali Rohár > >> --- > >> arch/arm/mach-omap2/omap-secure.h |1 + > >> arch/arm/mach-omap2/omap-smc.S| 22 +- > >> 2 files changed, 22 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/arm/mach-omap2/omap-secure.h > b/arch/arm/mach-omap2/omap-secure.h > >> index 0e72917..c4586f4 100644 > >> --- a/arch/arm/mach-omap2/omap-secure.h > >> +++ b/arch/arm/mach-omap2/omap-secure.h > >> @@ -51,6 +51,7 @@ > >> extern u32 omap_secure_dispatcher(u32 idx, u32 flag, u32 nargs, > >> u32 arg1, u32 arg2, u32 arg3, u32 arg4); > >> extern u32 omap_smc2(u32 id, u32 falg, u32 pargs); > >> +extern u32 omap_smc3(u32 id, u32 process, u32 flag, u32 pargs); > >> extern phys_addr_t omap_secure_ram_mempool_base(void); > >> extern int omap_secure_ram_reserve_memblock(void); > >> > >> diff --git a/arch/arm/mach-omap2/omap-smc.S > b/arch/arm/mach-omap2/omap-smc.S > >> index f6441c1..5c02b8d 100644 > >> --- a/arch/arm/mach-omap2/omap-smc.S > >> +++ b/arch/arm/mach-omap2/omap-smc.S > >> @@ -1,9 +1,11 @@ > >> /* > >> - * OMAP44xx secure APIs file. > >> + * OMAP34xx and OMAP44xx secure APIs file. > >> * > >> * Copyright (C) 2010 Texas Instruments, Inc. > >> * Written by Santosh Shilimkar > >> * > >> + * Copyright (C) 2012 Ivaylo Dimitrov > >> + * Copyright (C) 2013 Pali Rohár > >> * > >> * This program is free software,you can redistribute it and/or modify > >> * it under the terms of the GNU General Public License version 2 as > >> @@ -54,6 +56,24 @@ ENTRY(omap_smc2) > >> ldmfd sp!, {r4-r12, pc} > >> ENDPROC(omap_smc2) > >> > >> +/** > >> + * u32 omap_smc3(u32 service_id, u32 process_id, u32 flag, u32 pargs) > >> + * Low level common routine for secure HAL and PPA APIs via smc #1 > >> + * r0 - @service_id: Secure Service ID > >> + * r1 - @process_id: Process ID > >> + * r2 - @flag: Flag to indicate the criticality of operation > >> + * r3 - @pargs: Physical address of parameter list > >> + */ > >> +ENTRY(omap_smc3) > >> + stmfd sp!, {r4-r12, lr} > > > >You don't need to save/restore r12. The ABI allows it to be clobbered > >across function calls. > > > >> + mov r12, r0 @ Copy the secure service ID > >> + mov r6, #0xff @ Indicate new Task call > >> + dsb > >> + dmb > > > >dsb synchronises a superset of what dmb synchronises, so the dmb here is > >not useful. > > > >In any case, any code calling this must flush the region addressed by > >r3 beforehand anyway, which will include a dsb as part of its semantics > >-- this is how you call it from rx51_secure_dispatcher(). > > > >So I think the dsb may not be needed here (?) > > > >Cheers > >---Dave > > > > > > Could be, but I wonder why almost all the kernel code(I am aware of) that > uses SMC and is written by TI, is storing r12 and is using both DSB and DMB. > See arch/arm/mach-omap2/sleep34xx.S or arch/arm/mach-omap2/omap-smc.S for > examples. I'd rather play it safe and leave it that way, unless someone > confirms the other code using SMC has extra DSB/DMB instructions too. I > wouldn't risk passing invalid/stale data to the Secure Monitor to just save 8 > bytes and barriers in a performance non-critical code which will be called > only a couple of times during the boot-up process. r12 save/restore is a > legacy from omap_smc2 in omap-smc.S, so I guess it can go away without much > of a trouble. > Dave pointed out about the dsb and r12 to me in another thread. R12 can be easily removed but the DSB's were needed on OMAP for power sequencing. Without those, we have seen many issues. So you can leave the dsb's to be consistent with rest of the code. Regards Santosh P.S: Please sensibly wrap your message while replying. You reply apears like one single line and I needed to keep scrolling to read it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
DMA bounce limit is the maximum direct DMA'able memory beyond which bounce buffers has to be used to perform dma operations. SCSI driver relies on dma_mask but its calculation is based on max_*pfn which don't have uniform meaning across architectures. So make use of dma_max_pfn() which is expected to return the DMAable maximum pfn value across architectures. Cc: Russell King Cc: linux-s...@vger.kernel.org Signed-off-by: Santosh Shilimkar --- drivers/scsi/scsi_lib.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 86d5220..e8275fa 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost) host_dev = scsi_get_device(shost); if (host_dev && host_dev->dma_mask) - bounce_limit = *host_dev->dma_mask; + bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT; return bounce_limit; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC/RFT PATCH 5/5] ARM: mm: Remove bootmem code and switch to NO_BOOTMEM
In the effort of using memblock instead of bootmem allocator, ARM arch needs to be converted to use NO_BOOTMEM. With NO_BOOTMEM change, now we use memblock allocator to reserve space for crash kernel to have one less dependency with nobootmem allocator wrapper. Hopefully the NO_BOOTMEM memblock wrapper(nobootmem.c) will vanish in near future and archs can directly use memblock APIs. Ongoing thread on this topic is here: https://lkml.org/lkml/2013/6/29/77 Boot tested with both flat memory and sparse (faked) memory models with highmem enabled. LAPE systems with memory starting > 4GB still won't work but this is one of the step to solve that problem for ARM. Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Nicolas Pitre Cc: Tejun Heo Signed-off-by: Santosh Shilimkar --- arch/arm/Kconfig|1 + arch/arm/kernel/setup.c |2 +- arch/arm/mm/init.c | 58 ++- 3 files changed, 4 insertions(+), 57 deletions(-) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 0ac9be6..cff9a59 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -63,6 +63,7 @@ config ARM select OLD_SIGSUSPEND3 select OLD_SIGACTION select HAVE_CONTEXT_TRACKING + select NO_BOOTMEM help The ARM series is a line of low-power-consumption RISC chip designs licensed by ARM Ltd and targeted at embedded applications and diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 63af9a7..2ca4b90 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -805,7 +805,7 @@ static void __init reserve_crashkernel(void) if (ret) return; - ret = reserve_bootmem(crash_base, crash_size, BOOTMEM_EXCLUSIVE); + ret = memblock_reserve(crash_base, crash_size); if (ret < 0) { printk(KERN_WARNING "crashkernel reservation failed - " "memory is in use (0x%lx)\n", (unsigned long)crash_base); diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 588a2c1..84dd56c 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -153,58 +153,6 @@ static void __init find_limits(unsigned long *min, unsigned long *max_low, *max_high = bank_pfn_end(&mi->bank[mi->nr_banks - 1]); } -static void __init arm_bootmem_init(unsigned long start_pfn, - unsigned long end_pfn) -{ - struct memblock_region *reg; - unsigned int boot_pages; - phys_addr_t bitmap; - pg_data_t *pgdat; - - /* -* Allocate the bootmem bitmap page. This must be in a region -* of memory which has already been mapped. -*/ - boot_pages = bootmem_bootmap_pages(end_pfn - start_pfn); - bitmap = memblock_alloc_base(boot_pages << PAGE_SHIFT, L1_CACHE_BYTES, - __pfn_to_phys(end_pfn)); - - /* -* Initialise the bootmem allocator, handing the -* memory banks over to bootmem. -*/ - node_set_online(0); - pgdat = NODE_DATA(0); - init_bootmem_node(pgdat, __phys_to_pfn(bitmap), start_pfn, end_pfn); - - /* Free the lowmem regions from memblock into bootmem. */ - for_each_memblock(memory, reg) { - unsigned long start = memblock_region_memory_base_pfn(reg); - unsigned long end = memblock_region_memory_end_pfn(reg); - - if (end >= end_pfn) - end = end_pfn; - if (start >= end) - break; - - free_bootmem(__pfn_to_phys(start), (end - start) << PAGE_SHIFT); - } - - /* Reserve the lowmem memblock reserved regions in bootmem. */ - for_each_memblock(reserved, reg) { - unsigned long start = memblock_region_reserved_base_pfn(reg); - unsigned long end = memblock_region_reserved_end_pfn(reg); - - if (end >= end_pfn) - end = end_pfn; - if (start >= end) - break; - - reserve_bootmem(__pfn_to_phys(start), - (end - start) << PAGE_SHIFT, BOOTMEM_DEFAULT); - } -} - #ifdef CONFIG_ZONE_DMA unsigned long arm_dma_zone_size __read_mostly; @@ -242,7 +190,7 @@ void __init setup_dma_zone(struct machine_desc *mdesc) #endif } -static void __init arm_bootmem_free(unsigned long min, unsigned long max_low, +static void __init zone_sizes_init(unsigned long min, unsigned long max_low, unsigned long max_high) { unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES]; @@ -396,8 +344,6 @@ void __init bootmem_init(void) find_limits(&min, &max_low, &max_high); - arm_bootmem_init(min, max_low); - /* * Sparsemem tries to allocate bootmem in memory_present(), * so must be done after the fixed reservations @@ -414,7 +360,7 @@
[RFC/RFT PATCH 2/5] mm: dma-mapping: Add dma_max_pfn(dev) helper function
Most of the kernel assumes that PFN0 is the start of the physical memory (RAM). This assumptions is not true on most of the ARM SOCs and hence and if one try to update the ARM port to follow the assumptions, we end of breaking the dma bounce limit for few block layer drivers. One such example is trying to unify the meaning of max*_pfn on ARM as the bootmem layer expects, breaks few block layer driver dma bounce limit. To fix this problem, we introduce dma_max_pfn(dev) generic helper with a possibility of override from the architecture code. The helper converts a DMA bitmask of bits to a block PFN number. In all the generic cases, it is just "dev->dma_mask >> PAGE_SHIFT" and hence default behavior is maintained as is. Subsequent patches will make use of the helper. No functional change. Cc: Russell King Cc: Jens Axboe Signed-off-by: Santosh Shilimkar --- include/linux/dma-mapping.h |7 +++ 1 file changed, 7 insertions(+) diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 94af418..68a7863 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -129,6 +129,13 @@ static inline int dma_set_seg_boundary(struct device *dev, unsigned long mask) return -EIO; } +#ifndef dma_max_pfn +static inline unsigned long dma_max_pfn(struct device *dev) +{ + return *dev->dma_mask >> PAGE_SHIFT; +} +#endif + static inline void *dma_zalloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t flag) { -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC/RFT PATCH 0/5] mm: ARM nobootmem and few dma_mask fixes
The series is an attempt to move ARM port to NO_BOOTMEM. As discussed on list NO_BOOTMEM move needed updates to max*pfn meaning to be maximum PFNs but that breaks the dma_mask for few block layer drivers since ARM start of physical memory is not PFN0 unlike most of the architectures. Some more read on it is here: http://lwn.net/Articles/543408/ http://lwn.net/Articles/543424/ To address this issue, we introduce generic dma_max_pfn() helper which can be overridden from the architectures. Another intention behind move to nobootmem is also to convert ARM to switch to memblock and getting rid of bootmem allocator dependency which don't work for LPAE machines which has physical memory starting beyond 4 GB boundary. It needs changes to core kernel and also a new memblock API. More on this can be found here: https://lkml.org/lkml/2013/6/29/77 I have been trying to cook up these patches with kind help from Russell and we know series don't solve all the dma_mask bad assumptions. But at least I am hoping that it can get the ball rolling. Comments/testing help is welcome !! Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Nicolas Pitre Cc: Tejun Heo Cc: Jens Axboe Santosh Shilimkar (5): block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit() mm: dma-mapping: Add dma_max_pfn(dev) helper function scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations ARM: mm: change max*pfn to include the physical offset of memory ARM: mm: Remove bootmem code and switch to NO_BOOTMEM arch/arm/Kconfig |1 + arch/arm/include/asm/dma-mapping.h | 16 ++--- arch/arm/kernel/setup.c|2 +- arch/arm/mm/init.c | 68 block/blk-settings.c |8 ++--- drivers/scsi/scsi_lib.c|2 +- include/linux/dma-mapping.h|7 7 files changed, 32 insertions(+), 72 deletions(-) -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC/RFT PATCH 4/5] ARM: mm: change max*pfn to include the physical offset of memory
Most of the kernel code assumes that max*pfn is maximum pfns because the physical start of memory is expected to be PFN0. Since this assumption is not true on ARM architectures, the meaning of max*pfn is number of memory pages. This is done to keep drivers happy which are making use of of these variable to calculate the dma bounce limit using dma_mask. Now since we have a architecture override possibility for DMAable maximum pfns, lets make meaning of max*pfns as maximum pnfs on ARM as well. In the patch, the dma_to_pfn/pfn_to_dma() pair is hacked to take care of the physical memory offset. It is done this way just to enable testing since its understood that it can come in way of single zImage. Cc: Russell King Cc: Catalin Marinas Cc: Will Deacon Cc: Nicolas Pitre Signed-off-by: Santosh Shilimkar --- arch/arm/include/asm/dma-mapping.h | 16 arch/arm/mm/init.c | 10 -- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 5b579b9..b2d5937 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -47,12 +47,12 @@ static inline int dma_set_mask(struct device *dev, u64 mask) #ifndef __arch_pfn_to_dma static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) { - return (dma_addr_t)__pfn_to_bus(pfn); + return (dma_addr_t)__pfn_to_bus(pfn + PHYS_PFN_OFFSET); } static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr) { - return __bus_to_pfn(addr); + return __bus_to_pfn(addr) - PHYS_PFN_OFFSET; } static inline void *dma_to_virt(struct device *dev, dma_addr_t addr) @@ -64,15 +64,16 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr) { return (dma_addr_t)__virt_to_bus((unsigned long)(addr)); } + #else static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) { - return __arch_pfn_to_dma(dev, pfn); + return __arch_pfn_to_dma(dev, pfn + PHYS_PFN_OFFSET); } static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr) { - return __arch_dma_to_pfn(dev, addr); + return __arch_dma_to_pfn(dev, addr) - PHYS_PFN_OFFSET; } static inline void *dma_to_virt(struct device *dev, dma_addr_t addr) @@ -86,6 +87,13 @@ static inline dma_addr_t virt_to_dma(struct device *dev, void *addr) } #endif +/* The ARM override for dma_max_pfn() */ +static inline unsigned long dma_max_pfn(struct device *dev) +{ + return dma_to_pfn(dev, *dev->dma_mask); +} +#define dma_max_pfn(dev) dma_max_pfn(dev) + /* * DMA errors are defined by all-bits-set in the DMA address. */ diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c index 6833cbe..588a2c1 100644 --- a/arch/arm/mm/init.c +++ b/arch/arm/mm/init.c @@ -420,12 +420,10 @@ void __init bootmem_init(void) * This doesn't seem to be used by the Linux memory manager any * more, but is used by ll_rw_block. If we can get rid of it, we * also get rid of some of the stuff above as well. -* -* Note: max_low_pfn and max_pfn reflect the number of _pages_ in -* the system, not the maximum PFN. */ - max_low_pfn = max_low - PHYS_PFN_OFFSET; - max_pfn = max_high - PHYS_PFN_OFFSET; + min_low_pfn = min; + max_low_pfn = max_low; + max_pfn = max_high; } /* @@ -531,7 +529,7 @@ static inline void free_area_high(unsigned long pfn, unsigned long end) static void __init free_highpages(void) { #ifdef CONFIG_HIGHMEM - unsigned long max_low = max_low_pfn + PHYS_PFN_OFFSET; + unsigned long max_low = max_low_pfn; struct memblock_region *mem, *res; /* set highmem page free */ -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC/RFT PATCH 1/5] block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()
The blk_queue_bounce_limit() API parameter 'dma_mask' is actually the maximum address the device can handle rather than a dma_mask. Rename it accordingly to avoid it being interpreted as dma_mask. No functional change. The idea is to fix the bad assumptions about dma_mask wherever it could be miss-interpreted. Cc: Russell King Cc: Jens Axboe Signed-off-by: Santosh Shilimkar --- block/blk-settings.c |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/block/blk-settings.c b/block/blk-settings.c index c50ecf0..026c151 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -195,17 +195,17 @@ EXPORT_SYMBOL(blk_queue_make_request); /** * blk_queue_bounce_limit - set bounce buffer limit for queue * @q: the request queue for the device - * @dma_mask: the maximum address the device can handle + * @max_addr: the maximum address the device can handle * * Description: *Different hardware can have different requirements as to what pages *it can do I/O directly to. A low level driver can call *blk_queue_bounce_limit to have lower memory pages allocated as bounce - *buffers for doing I/O to pages residing above @dma_mask. + *buffers for doing I/O to pages residing above @max_addr. **/ -void blk_queue_bounce_limit(struct request_queue *q, u64 dma_mask) +void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr) { - unsigned long b_pfn = dma_mask >> PAGE_SHIFT; + unsigned long b_pfn = max_addr >> PAGE_SHIFT; int dma = 0; q->bounce_gfp = GFP_NOIO; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver
Thomas, On Friday 13 September 2013 10:55 AM, Santosh Shilimkar wrote: > On Friday 13 September 2013 10:24 AM, Thomas Gleixner wrote: [...] >> Before you dig into MSI, lets talk about irq domains first. >> >> GIC implements a legacy irq domain, i.e. a linear domain of all >> possible GIC interrupts with a 1:1 mapping. >> >> So why can't you make use of irq domains and have the whole routing >> business implemented sanely? >> >> What's needed is in gic_init_bases(): >> >>if (of_property_read(node, "routable_irqs", &nr_routable_irqs) { >>irq_domain_add_legacy(nr_gic_irqs); >>} else { >>irq_domain_add_legacy(nr_per_cpu_irqs); >>irq_domain_add_linear(nr_routable_irqs); >>} >> >> Now that separate domain has an xlate function which grabs a free GIC >> irq from a bitmap and returns the hardware irq number in the gic >> space. The map/unmap callbacks take care of setting up / tearing down >> the route in the crossbar. >> >> Thoughts? >> > This sounds pretty good idea. We will explore above option. > Thanks Thomas. > After further looking into this, the irqdomain approach lets us setup the map only once during the init. This is similar to the earlier approach of cross-bar driver where at probe time the router was setup. The whole debate started with the fact that we shouldn't fix the irq mapping at probe and should dynamically change the mapping based on [request/free]_irq() to be able to maximize the use of the IP. Since we have agreed now to move ahead with irdomain, i thought of mentioning it here. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver
Thomas, On Thursday 12 September 2013 04:18 PM, Thomas Gleixner wrote: > On Thu, 12 Sep 2013, Sricharan R wrote: >> Signed-off-by: Sricharan R >> --- >> There is lockdep warning during the boot. This is because we try to >> do one request_irq with in another and that results in kmalloc being >> called from an atomic context, which generates the warning. >> Any suggestions to overcome this will help. > > You can't be serious about that. You post a patch series with a > serious bug in it instead of asking in the first place how to solve > the problem at hand. > > So why do you actually need to call request_irq() from inside > request_irq() and how do you actually manage to do that at all? > > I have a hard time to figure out how request_irq() might call itself > recursive. And I have no intention to look at your patch to figure out > that you abuse an irq chip callback to do so, simply because that > would force me to use abusive language which is frowned upon nowadays. > This is an outcome of the discussion on earlier patch set [1] which tried to add IRQ event router functionality. From beginning I was against the idea because I also felt that we are abusing the irqchip infrastructure and force fitting the cross-bar to be behave like an irqchip. But Linus W and few others strongly felt it to make it an irqchip implementation. > Can you please explain what you are trying to solve without referring > to your existing broken implementation. > I will try to summaries the IP and its need here. On TI SOCs, we have an IP cross-bar which is essentially an even router(hardware). It can route the IRQ and DMA in existing implementation. Specifically for the IRQ case addressed here, the cross-bar IP sits between the interrupt controller and peripheral interrupts. CPU <-- GIC <- CROSSBAR <- PERIPHERAL IRQs Just to expand it better, cross-bar input IRQ lines are more than what a GIC IRQ controller can support. e.q Total 250 peripheral IRQ lines out of which GIC support only 160 IRQ lines. So the idea here is to dynamically map the IRQ lines at cross-bar level to pick based on request_irq() so that one can optimize the use of limited IRQ lines at the GIC level. Strictly speaking the need is just establish the IRQ connection from peripheral to GIC and thats achieved at the request_irq() level. Earlier approach was to statically build this connections using the DT information in a separate driver probe but it had limitations of fixing the IRQ map and taking away flexibility what this IP provide. Hope this gives better picture to you behind the patch series. Just for others to know, use of IRQCHIP also forced to have all the irqchip handler redirection which is pure redirection including the irq-handler. And since it is *fast path* I asked Sricharan to measure the latency which is around ~2 uS( 1GHz CPU) overhead for every interrupt just because of redirections. Regards, Santosh [1] https://lkml.org/lkml/2013/7/18/362 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver
On Thursday 12 September 2013 06:22 PM, Thomas Gleixner wrote: > On Thu, 12 Sep 2013, Santosh Shilimkar wrote: >> Specifically for the IRQ case addressed here, the cross-bar IP >> sits between the interrupt controller and peripheral interrupts. >> >> CPU <-- GIC <- CROSSBAR <- PERIPHERAL IRQs >> >> Just to expand it better, cross-bar input IRQ lines are more than >> what a GIC IRQ controller can support. >> e.q Total 250 peripheral IRQ lines out of which GIC support >> only 160 IRQ lines. >> >> So the idea here is to dynamically map the IRQ lines at >> cross-bar level to pick based on request_irq() so that one >> can optimize the use of limited IRQ lines at the GIC level. >> Strictly speaking the need is just establish the IRQ >> connection from peripheral to GIC and thats achieved >> at the request_irq() level. >> >> Earlier approach was to statically build this connections >> using the DT information in a separate driver probe but >> it had limitations of fixing the IRQ map and taking away >> flexibility what this IP provide. >> >> Hope this gives better picture to you behind the patch >> series. > > Yes. I halfways understand what you are trying to achieve. > > So CROSSBAR is a routing block between the peripheral and the GIC in > order to expand the number of possible interrupts. > > Now the real question is, how that expansion mechanism is supposed to > work. There are two possible scenarios: > > 1) Expand the number of handled interrupts beyond the GIC capacity: > >That requires a mechanism in CROSSBAR to map several CROSSBAR >interrupts to a particular GIC interrupt and provide a demux >mechanism to invoke the shared handlers. > This is not possible in hardware and not supported. Hardware has no notion of muxing multiple IRQ's to generate 1 IRQ or ack etc functionality. Its a simple MUX to tie knots between input and output wires. > 2) Provide a mapping mechanism between possibly 250 interrupt numbers >and a limitation of a total 160 active interrupts by the underlying >GIC. > This is the need and problem we are trying to solve. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver
On Thursday 12 September 2013 08:26 PM, Thomas Gleixner wrote: > On Thu, 12 Sep 2013, Santosh Shilimkar wrote: >> On Thursday 12 September 2013 06:22 PM, Thomas Gleixner wrote: >>> Now the real question is, how that expansion mechanism is supposed to >>> work. There are two possible scenarios: >>> >>> 1) Expand the number of handled interrupts beyond the GIC capacity: >>> >>>That requires a mechanism in CROSSBAR to map several CROSSBAR >>>interrupts to a particular GIC interrupt and provide a demux >>>mechanism to invoke the shared handlers. >>> >> This is not possible in hardware and not supported. Hardware has >> no notion of muxing multiple IRQ's to generate 1 IRQ or ack etc >> functionality. Its a simple MUX to tie knots between input and output >> wires. > > It's not a MUX. It's a ROUTING mechanism. That's similar to the > mechanisms which are used by MSI[X]. We assign arbitrary interrupt > numbers to a device and route them to some underlying limited hardware > interrupt controller. > >>> 2) Provide a mapping mechanism between possibly 250 interrupt numbers >>>and a limitation of a total 160 active interrupts by the underlying >>>GIC. >>> >> This is the need and problem we are trying to solve. > > Let me summarize: > >- GIC supports up to 160 interrupts > >- CROSSBAR supports up to 250 interrupts > >- CROSSBAR routes up to 160 out of 250 interrupts to the GIC ones > >- Drivers request a CROSSBAR interrupt number which must be mapped > to some arbitrary available GIC irq number > Correct. > So basically the CROSSBAR mechanism is pretty much the same as MSI[X] > just in a different flavour and with a different set of semantics and > limitations, i.e. poor mans MSI[X] with a new level of bogosity. > > So if CROSSBAR is going to be the new fangled SoC MSI[X] long term > equivalent then you better provide some infrastructure for that and > make the drivers ready to use it. Maybe check with the PCI/MSI folks > to share some of the interfaces. > > If that whole thing is another onetime HW designers wet dream, then > please go back to the limited but completely functional (Who is going > to use more than 160 peripheral interrupts) device tree model. I > really have no interest to support hardware designer brain farts. > Thanks for clear NAK for irqchip approach. I should have looped you in the discussion where I was also suggesting against the irqchip approach. We will try to look at MSI stuff but if its get too complicated am going to fall-back to the initial probe based approach to achieve the functionality. Thanks again for clear direction and useful discussion. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/4] DRIVERS: IRQCHIP: Add crossbar irqchip driver
On Friday 13 September 2013 10:24 AM, Thomas Gleixner wrote: > On Thu, 12 Sep 2013, Santosh Shilimkar wrote: >> On Thursday 12 September 2013 08:26 PM, Thomas Gleixner wrote: >>> Let me summarize: >>> >>>- GIC supports up to 160 interrupts >>> >>>- CROSSBAR supports up to 250 interrupts >>> >>>- CROSSBAR routes up to 160 out of 250 interrupts to the GIC ones >>> >>>- Drivers request a CROSSBAR interrupt number which must be mapped >>> to some arbitrary available GIC irq number >>> >> Correct. >> >>> So basically the CROSSBAR mechanism is pretty much the same as MSI[X] >>> just in a different flavour and with a different set of semantics and >>> limitations, i.e. poor mans MSI[X] with a new level of bogosity. >>> >>> So if CROSSBAR is going to be the new fangled SoC MSI[X] long term >>> equivalent then you better provide some infrastructure for that and >>> make the drivers ready to use it. Maybe check with the PCI/MSI folks >>> to share some of the interfaces. >>> >>> If that whole thing is another onetime HW designers wet dream, then >>> please go back to the limited but completely functional (Who is going >>> to use more than 160 peripheral interrupts) device tree model. I >>> really have no interest to support hardware designer brain farts. >>> >> Thanks for clear NAK for irqchip approach. I should have looped you >> in the discussion where I was also suggesting against the irqchip >> approach. We will try to look at MSI stuff but if its get too >> complicated am going to fall-back to the initial probe based >> approach to achieve the functionality. > > Before you dig into MSI, lets talk about irq domains first. > > GIC implements a legacy irq domain, i.e. a linear domain of all > possible GIC interrupts with a 1:1 mapping. > > So why can't you make use of irq domains and have the whole routing > business implemented sanely? > > What's needed is in gic_init_bases(): > >if (of_property_read(node, "routable_irqs", &nr_routable_irqs) { > irq_domain_add_legacy(nr_gic_irqs); >} else { > irq_domain_add_legacy(nr_per_cpu_irqs); > irq_domain_add_linear(nr_routable_irqs); >} > > Now that separate domain has an xlate function which grabs a free GIC > irq from a bitmap and returns the hardware irq number in the gic > space. The map/unmap callbacks take care of setting up / tearing down > the route in the crossbar. > > Thoughts? > This sounds pretty good idea. We will explore above option. Thanks Thomas. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sched_clock: fix postinit no sched_clock function check
On Wednesday 09 October 2013 07:59 PM, John Stultz wrote: > On 10/02/2013 11:07 AM, Santosh Shilimkar wrote: >> On Wednesday 02 October 2013 01:48 PM, Will Deacon wrote: >>> On Wed, Oct 02, 2013 at 06:42:40PM +0100, Stephen Boyd wrote: >>>> On 10/02/13 10:27, Santosh Shilimkar wrote: >>>>> Really... I have not created patch out of fun. >>>>> Its broken on my keystone machine at least where the sched_clock is >>>>> falling back on jiffy based sched_clock even in presence of arch_timer >>>>> sched_clock. >>>> How is that possible? sched_clock_func is only assigned by >>>> arch/arm/kernel/arch_timer.c when the architected timer is detected and >>>> sched_clock() in kernel/time/sched_clock.c calls that function pointer >>>> unconditionally. The only way I see this happening is if the architected >>>> timer rate is zero. >>> ^^ >>> >>> *cough* CNTFRQ *cough* >>> >> :) CNTFRQ as such is fine. I think the below print mis-lead me mostly. >> >> sched_clock: ARM arch timer >56 bits at 6144kHz, resolution 162ns >> sched_clock: 32 bits at 100 Hz, resolution 1000ns, wraps every >> 4294967286ms >> >> So yes, now the subject patch actually just avoids the jiffy sched_clock() >> registration and nothing else. Even without the patch arch_timer sched_clock >> will be in use. > > Just wanted to follow up here, as I've not been paying close attention. > Is this issue then resolved, or is something still needed to be queued > for 3.12/3.13? > There is no regression as I initially thought. Patch fixes the miss-leading sched_clock print and also prevents timer to handle wrapping which is not needed. So no big deal and I don't mind if we don't apply it. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v4 0/5] clk: dt: bindings for mux, divider & gate clocks
Mike, On Thursday 22 August 2013 01:53 AM, Mike Turquette wrote: > This series introduces binding definitions for common register-mapped > clock multiplexer, divider and gate IP blocks along with the > corresponding setup functions for matching DT data. The bindings are > similar to the struct definitions but please don't hold that against the > binding: the struct definitions closely model the hardware register > layout. > > The delta from v3 is very small and only consists of cosmetic changes > and one functional change for better error handling. > > Changes since v3: > * added Tested-by & Reviewed-by from Heiko > * replaced underscores with dashes in DT property names > * bail from of clock setup function early if of_iomap fails > * removed unecessary explict cast > > Mike Turquette (5): > clk: divider: replace bitfield width with mask > clk: of: helper for determining number of parent clocks > clk: dt: binding for basic multiplexer clock > clk: dt: binding for basic divider clock > clk: dt: binding for basic gate clock > I have used $subject series as the base for the Keystone ccf support and tested these patches on Keystone EVM. Will post those patches soon on the lists. For the series if you need more tested-by tags, feel free to add, Tested-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] gpio/omap: maintain GPIO and IRQ usage separately
On Tuesday 24 September 2013 08:36 PM, Javier Martinez Canillas wrote: > The GPIO OMAP controller pins can be used as IRQ and GPIO > independently so is necessary to keep track GPIO pins and > IRQ lines usage separately to make sure that the bank will > always be enabled while being used. > > Also move gpio_is_input() definition in preparation for the > next patch that setups the controller's irq_chip driver when > a caller requests an interrupt line. > > Signed-off-by: Javier Martinez Canillas > --- Considering GPIO core maintainer is fine with this approach, am fine with both of your patches. Again thanks a lot for fixing the long nagging issue. FWIW, Acked-by: Santosh Shilimkar -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 0/6] DRIVERS: IRQCHIP: Add support for crossbar IP
On Monday 30 September 2013 09:59 AM, Sricharan R wrote: > Some socs have a large number of interrupts requests to service > the needs of its many peripherals and subsystems. All of the interrupt > requests lines from the subsystems are not needed at the same > time, so they have to be muxed to the controllers appropriately. > In such places a interrupt controllers are preceded by an > IRQ CROSSBAR that provides flexibility in muxing the device interrupt > requests to the controller inputs. > > This series models the peripheral interrupts that can be routed through > the crossbar to the GIC as 'routable-irqs'. The routable irqs are added > in a separate linear domain inside the GIC. The registered routable domain's > callback are invoked as a part of the GIC's callback, which in turn should > allocate a free irq line and configure the IP accordingly. So every peripheral > in the dts files mentions the fixed crossbar number as its interrupt. A free > gic line for that gets allocated and configured when the peripheral's > interrupt > is mapped. > > The minimal crossbar driver to track and allocate free GIC lines and > configure the > crossbar is added here, along with the DT bindings. > You should have references to the previous discussions so that its easier for new reviewers to understand why you ended up the approach. I noticed you missed this in your last posts as well. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC PATCH 1/6] DRIVERS: IRQCHIP: IRQ-GIC: Add support for routable irqs
On Monday 30 September 2013 10:16 AM, Marc Zyngier wrote: > On 30/09/13 14:59, Sricharan R wrote: >> In some socs the gic can be preceded by a crossbar IP which >> routes the peripheral interrupts to the gic inputs. The peripheral >> interrupts are associated with a fixed crossbar input line and the >> crossbar routes that to one of the free gic input line. >> >> The DT entries for peripherals provides the fixed crossbar input line >> as its interrupt number and the mapping code should associate this with >> a free gic input line. This patch adds the support inside the gic irqchip >> to handle such routable irqs. The routable irqs are registered in a linear >> domain. The registered routable domain's callback should be implemented >> to get a free irq and to configure the IP to route it. > > Isn't this just another chained interrupt controller? How is it GIC > specific? > No it isn't a irq controller rather a event router. Patch is missing reference to the previous discussion. Previous discussion is here [1] Regards, Santosh [1] https://lkml.org/lkml/2013/9/13/413 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] gpio/omap: auto-setup a GPIO when used as an IRQ
Javier, On Monday 23 September 2013 01:07 PM, Tony Lindgren wrote: > * Javier Martinez Canillas [130923 10:09]: >> On 09/23/2013 06:45 PM, Tony Lindgren wrote: >>> >>> Hmm does this still work for legacy platform data based >>> drivers that are doing gpio_request() first? >>> >> >> Yes it still work when booting using board files. I tested on my OMAP3 board >> and >> it worked in both DT and legacy booting mode. > > OK great. > >>> And what's the path for clearing things for PM when free_irq() >>> gets called? It seems that this would leave the GPIO bank >>> enabled causing a PM regression? >>> >> >> Indeed, I did set bank->mod_usage |= 1 << offset so the bank is enabled if >> the >> device goes to suspended and then resumed but I completely forget about the >> clearing path when the IRQ is freed. >> >> Which makes me think that we should probably maintain two usage variables, >> one >> for GPIO and another one for IRQ and check both of them on the >> suspend/resume pm >> functions. > > Yes that it seems that they should be treated separately. > As discussed on IRC, the patch as such is fine after the mentioned fixup, I would like to hear back if Linus W/Grant is fine with the approach. Not sure if I missed the discussion, but the proposed patch is deviation from traditional method of doing gpio_request() first up to perform other gpio operations. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC v2] gpio/omap: auto-setup a GPIO when used as an IRQ
On Tuesday 24 September 2013 11:45 AM, Balaji T K wrote: > On Tuesday 24 September 2013 09:10 PM, Tony Lindgren wrote: >> * Javier Martinez Canillas [130924 01:06]: >>> The OMAP GPIO controller HW requires a pin to be configured in GPIO >>> input mode in order to operate as an interrupt input. Since drivers >>> should not be aware of whether an interrupt pin is also a GPIO or not, >>> the HW should be fully configured/enabled as an IRQ if a driver solely >>> uses IRQ APIs such as request_irq(), and never calls any GPIO-related >>> APIs. As such, add the missing HW setup to the OMAP GPIO controller's >>> irq_chip driver. >>> >>> Since this bypasses the GPIO subsystem we have to ensure that another >>> caller won't be able to request the same GPIO pin that is used as an >>> IRQ and set its direction as output. Requesting the GPIO and setting >>> its direction as input is allowed though. >> >> Also please mention the regression that this fixes. So far we know >> that smsc911x for tobi and igep boards in mainline, and also the > >> MMC card detect for omap4 boards. > Hi Tony, > > Card detect on omap4 board (sdp and panda) is not based on omap gpio, > so I think this fix is not applicable for omap4. > > Card detect line for SD card goes to power IC on OMAP4 panda and SDP. > I confused Tony mostly. It was OMAP4 SPI based ethernet which uses the GPIO as an interrupt line. So for Panda, its Ethernet driver and not MMC. Regards, Santosh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/