Re: [RFC PATCH 05/17] ARM: kernel: save/restore kernel IF
Hi Lorenzo, only a few comments at this stage. The sr_entry.S code is both exclusively .arm (using conditionals and long-distance adr, i.e. not Thumb2-clean), and it uses post-armv5 instructions (like wfi). Same for the other *.S code in the patch series. It's non-generic assembly within arch/arch/kernel/, wouldn't one better place this into arch/arm/mm/...-v[67].S ? Then, sr_suspend/sr_resume; these functions are "C-exported" and are directly calling cpu_do_suspend/do_resume to pass a supplied buffer; I've done that for one iteration of the hibernation patch, yes, but that was a bit sneaky and Russell stated then the interface is cpu_suspend/cpu_resume not the proc funcs directly. Unless _those_ have been changed they're also unsafe to call from C funcs (clobber all regs). Couldn't you simply use cpu_suspend/resume directly ? How much memory do all the pagedirs require that are being kept around ? Why does each core need a separate one, what would happen to just use a single "identity table" for all ? I understand you can't use swapper_pg_dir for idle, so a separate one has to be allocated, yet the question remains why per-cpu required ? I'm currently transitioning between jobs; will re-subscribe to arm-kernel under a different email address soon, this one is likely to stop working in August. Sorry the inconvenience and high-latency responses till then :( FrankH. On Thu, 7 Jul 2011, Lorenzo Pieralisi wrote: In order to define a common idle interface for the kernel to enter low power modes, this patch provides include files and code that manages OS calls for low power entry and exit. In ARM world processor HW is categorized as CPU and Cluster. Corresponding states defined by this common IF are: C-state [CPU state]: 0 - RUN MODE 1 - STANDBY 2 - DORMANT (not supported by this patch) 3 - SHUTDOWN R-state [CLUSTER state] 0 - RUN 1 - STANDBY (not supported by this patch) 2 - L2 RAM retention 3 - SHUTDOWN idle modes are entered through cpu_enter_idle(cstate, rstate, flags) [sr_entry.S] which could replace the current processor.idle entry in proc info, since it just executes wfi for shallow C-states. Cluster low-power states are reached if and only if all the CPUs in the cluster are in low-power mode. Only one cluster is supported at present, and the kernel infrastructure should be improved to allow multiple clusters to be defined and enumerated. Current page table dir and stack pointers are saved using a per-cpu variable; this scheme breaks as soon as clusters are added to the kernel. The code keeps a cpumask of alive CPUs and manages the state transitions accordingly. Most of the variables needed when the CPU is powered down (MMU off) are allocated through a platform hook: platform_context_pointer(unsigned int size) that returns memory flat-mapped by this patchset as strongly ordered to avoid toying with L2 cleaning when a single CPU enters lowpower. Fully tested on dual-core A9 cluster. Signed-off-by: Lorenzo Pieralisi --- arch/arm/include/asm/sr_platform_api.h | 28 arch/arm/kernel/sr_api.c | 197 + arch/arm/kernel/sr_entry.S | 213 3 files changed, 438 insertions(+), 0 deletions(-) create mode 100644 arch/arm/include/asm/sr_platform_api.h create mode 100644 arch/arm/kernel/sr_api.c create mode 100644 arch/arm/kernel/sr_entry.S diff --git a/arch/arm/include/asm/sr_platform_api.h b/arch/arm/include/asm/sr_platform_api.h new file mode 100644 index 000..32367be --- /dev/null +++ b/arch/arm/include/asm/sr_platform_api.h @@ -0,0 +1,28 @@ +/* + * Copyright (C) 2008-2011 ARM Limited + * + * Author(s): Jon Callan, Lorenzo Pieralisi + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + */ + +#ifndef ASMARM_SR_PLATFORM_API_H +#define ASMARM_SR_PLATFORM_API_H + +#define SR_SAVE_L2 (1 << 31) +#define SR_SAVE_SCU(1 << 30) +#define SR_SAVE_ALL(SR_SAVE_L2 | SR_SAVE_SCU) + +struct lp_state { + u16 cpu; + u16 cluster; +}; + +extern void (*sr_sleep)(void); +extern void (*arch_reset_handler(void))(void); +extern int cpu_enter_idle(unsigned cstate, unsigned rstate, unsigned flags); +extern void *platform_context_pointer(unsigned int); +#endif diff --git a/arch/arm/kernel/sr_api.c b/arch/arm/kernel/sr_api.c new file mode 100644 index 000..4e48f60 --- /dev/null +++ b/arch/arm/kernel/sr_api.c @@ -0,0 +1,197 @@ +/* + * Copyright (C) 2008-2011 ARM Limited + * + * Author(s): Jon Callan, Lorenzo Pieralisi + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#in
Re: [RFC PATCH 05/17] ARM: kernel: save/restore kernel IF
On Mon, 11 Jul 2011, Lorenzo Pieralisi wrote: On Fri, Jul 08, 2011 at 05:12:22PM +0100, Frank Hofmann wrote: Hi Lorenzo, only a few comments at this stage. The sr_entry.S code is both exclusively .arm (using conditionals and long-distance adr, i.e. not Thumb2-clean), and it uses post-armv5 instructions (like wfi). Same for the other *.S code in the patch series. It's non-generic assembly within arch/arch/kernel/, wouldn't one better place this into arch/arm/mm/...-v[67].S ? Yes, it is certainly something I should improve to make it more generic. Then, sr_suspend/sr_resume; these functions are "C-exported" and are directly calling cpu_do_suspend/do_resume to pass a supplied buffer; I've done that for one iteration of the hibernation patch, yes, but that was a bit sneaky and Russell stated then the interface is cpu_suspend/cpu_resume not the proc funcs directly. Unless _those_ have been changed they're also unsafe to call from C funcs (clobber all regs). Couldn't you simply use cpu_suspend/resume directly ? Well, short answer is no. On SMP we do need to save CPU registers but if just one single cpu is shutdown L2 is still on. cpu_suspend saves regs on the stack that has to be cleaned from L2 before shutting a CPU down which make things more complicated than they should. That's why I use cpu_do_{suspend,resume}, so that I can choose the memory used to save registers (uncached), but that's an abuse of API. There's the option to use a switch_stack() as Will Deacon has provided via his kexec series. Agreed not mainline yet but good for ideas. Will's diff is here: http://www.spinics.net/lists/arm-kernel/msg127951.html and that one restores the original sp after the "stack switched" function returns. If you use that to switch to a different (uncached) stack before doing cpu_suspend (and the 'idle' finisher through that), wouldn't that solve the problem ? When the suspend returns, the above would restore your cached stackpointer. I agree this is sneaky, but I wanted to avoid duplicating code that just saves registers. Maybe moving to an uncached stack might solve this problem if we want to reuse cpu_suspend from cpu idle, which is still an open point. (me speedtyping above ...) As you say ;-) How much memory do all the pagedirs require that are being kept around ? Why does each core need a separate one, what would happen to just use a single "identity table" for all ? I understand you can't use swapper_pg_dir for idle, so a separate one has to be allocated, yet the question remains why per-cpu required ? I just allocate a 1:1 mapping once cloned from init_mm, reused for all CPUs, with an additional mapping. Have started to wonder whether this facility (a 1:1-mapped pagedir for kernel text/data, or maybe even "non-user text/data") could/should be made available on a global scale; after all, both kexec, reset, hibernate, some forms of idle/suspend do require "some sort of that" to go through MMU reinitialization. I'm unfortunately not deep enough in the VM subsystem to say how exactly this best would have to look like. Merely mentioning this because it looks while everyone creates 1:1 mappings, there's ever so slight differences between how the 1:1 mapping(s) are created; we've seen: * (re)using current->active_mm->pgd * (re)using swapper_pg_dir (different lengths for the 1:1 section) * using a separately-allocated/initialized pgdir Such proliferation usually means there's a justified need to do that kind of thing. Just the gritty details haven't been sorted how everyone with that need could do _the same_ thing instead of reinventing various square wheels. The main reason why I've used swapper_pg_dir in the hibernation patch is because it's the only static one in the system (the hibernation hooks are in irqs-off codepaths and pgd_alloc isn't a good idea then), and happens to have "clear" lower sections (in the user area) so that one's not actually substituting anything when creating 1:1 mappings (and getting rid of them restores the "pristine" state). But this assumption only holds as long as swapper_pg_dir's use isn't changed. So a little creepy feeling remains. A generic "reset_mmu_pg_dir", wouldn't that be a good thing to have ? The array of pointers is there to save pgdir on idle entry, one per-cpu. If you're going through cpu_{do_}suspend/resume, the TTBRs are saved/restored anyway, what do you need to keep the virtual addresses around for ? I'm currently transitioning between jobs; will re-subscribe to arm-kernel under a different email address soon, this one is likely to stop working in August. Sorry the inconvenience and high-latency responses till then :( FrankH. May I thank you v
Re: [RFC PATCH 05/17] ARM: kernel: save/restore kernel IF
On Mon, 11 Jul 2011, Lorenzo Pieralisi wrote: [ ... ] The array of pointers is there to save pgdir on idle entry, one per-cpu. If you're going through cpu_{do_}suspend/resume, the TTBRs are saved/restored anyway, what do you need to keep the virtual addresses around for ? Because I switch mm before calling suspend, which is called with a cloned pgdir. I am not sure I can avoid that. On resume, you'll be restoring the same thread as was previously running, right ? If so, all you do there is copying current->active_mm->pgd to some other place ? Also, if you'd be using cpu_suspend(), would there still be a need for cpu_switch_mm() before ? It'd rather be a case of possibly calling that before the MMU-off sequence / cpu_resume() ? Or is it that you use the new pgdir to make a memory region uncacheable ? FrankH. ___ linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev