On Mon 15-05-17 16:44:26, Pasha Tatashin wrote:
> On 05/15/2017 03:38 PM, Michal Hocko wrote:
> >I do not think this is the right approach. Your measurements just show
> >that sparc could have a more optimized memset for small sizes. If you
> >keep the same memset only for the parallel initializati
From: "Gautham R. Shenoy"
The current code in the cpuidle-powernv initialization only allows deep
stop states (indicated by OPAL_PM_STOP_INST_DEEP) which lose timebase
(indicated by OPAL_PM_TIMEBASE_STOP). This assumption goes back to
POWER8 time where deep states used to lose the timebase. Howeve
From: "Gautham R. Shenoy"
On Power9 DD1 due to a hardware bug the Power-Saving Level Status
field (PLS) of the PSSCR for a thread waking up from a deep state can
under-report if some other thread in the core is in a shallow stop
state. The scenario in which this can manifest is as follows:
From: "Gautham R. Shenoy"
The lower 8 bits of core_idle_state_ptr track the number of non-idle
threads in the core. This is supposed to be initialized to the bitmap
corresponding to the threads_per_core. However, currently it is
initialized to PNV_CORE_IDLE_THREAD_BITS (0xFF). This is correct for
P
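The difference between a fixed 0xFF and a bitmap derived from threads_per_core can be sketched as below. This is only an illustration of the idea described above: core_thread_mask() is a hypothetical helper name, not a function taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): build a bitmap with one bit
 * set per hardware thread in a core.
 */
static inline uint8_t core_thread_mask(unsigned int threads_per_core)
{
	/* The thread bits occupy the lower 8 bits, so cap at 8 threads. */
	if (threads_per_core >= 8)
		return 0xFF;

	/* e.g. 4 threads -> 0b00001111 */
	return (uint8_t)((1u << threads_per_core) - 1);
}
```

With 8 threads per core the result equals 0xFF, so the fixed constant and the computed bitmap only differ on cores with fewer threads.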
From: Akshay Adiga
Some of the SPR values (HID0, MSR, SPRG0) don't change during the run
time of a booted kernel, once they have been initialized.
The contents of these SPRs are lost when the CPUs enter deep stop
states. So instead of saving and restoring SPRs from the kernel, use the
stop-api prov
From: "Gautham R. Shenoy"
On wakeup from a deep stop state which is supposed to lose the
hypervisor state, we don't restore the LPCR to the old value but set
it to a "sane" value via cur_cpu_spec->cpu_restore().
The problem is that the "sane" value doesn't include UPRT and the HR
bits which are
From: "Gautham R. Shenoy"
On POWER8, in case of
- nap: both timebase and hypervisor state are retained.
- fast-sleep: timebase is lost, but the hypervisor state is retained.
- winkle: both timebase and hypervisor state are lost.
Hence, the current code for handling exit from an idle state as
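The POWER8 list above can be written as a small lookup, which makes it easy to see which resources need restoring after each state. This is a sketch for illustration; the enum and function names are hypothetical and do not come from the kernel sources.

```c
#include <stdbool.h>

/* Hypothetical flag names, used only for this illustration. */
enum lost_state {
	LOSES_NOTHING  = 0,
	LOSES_TIMEBASE = 1 << 0,
	LOSES_HYP      = 1 << 1,
};

/* The three POWER8 idle states named in the list above. */
enum p8_idle { P8_NAP, P8_FAST_SLEEP, P8_WINKLE };

/* Return which resources are lost while in the given idle state. */
static inline unsigned int p8_lost_state(enum p8_idle state)
{
	switch (state) {
	case P8_NAP:
		return LOSES_NOTHING;
	case P8_FAST_SLEEP:
		return LOSES_TIMEBASE;
	case P8_WINKLE:
		return LOSES_TIMEBASE | LOSES_HYP;
	}
	return LOSES_NOTHING;
}

/* True if the timebase must be resynchronized on wakeup. */
static inline bool needs_timebase_resync(enum p8_idle state)
{
	return (p8_lost_state(state) & LOSES_TIMEBASE) != 0;
}

/* True if hypervisor state must be restored on wakeup. */
static inline bool needs_hyp_restore(enum p8_idle state)
{
	return (p8_lost_state(state) & LOSES_HYP) != 0;
}
```

On POWER8 this table implies that losing hypervisor state always goes together with losing the timebase, which is the assumption the rest of the message goes on to discuss.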
From: "Gautham R. Shenoy"
Hi,
This patch series contains some of the fixes required for enabling
support for deep stop states such as STOP4 and STOP11 via CPU-Hotplug.
These fixes mainly ensure that some of the hypervisor resources which
are lost during the deep stop state are correctly restore
This moves the #ifdef in C code to a Kconfig dependency. We also move the
gigantic_page_supported() function to be arch specific. This lets an arch
conditionally enable runtime allocation of gigantic huge pages. Architectures
like ppc64 support different gigantic huge page sizes (16G and 1G) based
POWER9 supports hugepages of size 2M and 1G in radix MMU mode. This patch
enables the usage of 1G page size for hugetlbfs. This also updates the helper
so that we can do 1G page allocation at runtime.
We still don't enable 1G page size on the DD1 version. This is to avoid doing the
workaround mentioned in com
HugeTLB migration support for PPC64
Changes from V1:
* Added Reviewed-by:
* Drop follow_huge_addr from powerpc
Aneesh Kumar K.V (8):
mm/hugetlb/migration: Use set_huge_pte_at instead of set_pte_at
mm/follow_page_mask: Split follow_page_mask to smaller functions.
mm/hugetlb: export hugetlb_e
The right interface to use to set a hugetlb pte entry is set_huge_pte_at. Use
that instead of set_pte_at.
Reviewed-by: Naoya Horiguchi
Signed-off-by: Aneesh Kumar K.V
---
mm/migrate.c | 21 +++--
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/mm/migrate.c b/mm/m
We will be using this later from the ppc64 code. Change the return type to bool.
Reviewed-by: Naoya Horiguchi
Signed-off-by: Aneesh Kumar K.V
---
include/linux/hugetlb.h | 1 +
mm/hugetlb.c| 8
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/hug
From: Anshuman Khandual
ppc64 supports pgd hugetlb entries. Add code to handle hugetlb pgd entries to
follow_page_mask so that ppc64 can switch to it to handle hugetlb entries.
Signed-off-by: Anshuman Khandual
Signed-off-by: Aneesh Kumar K.V
---
include/linux/hugetlb.h | 4
mm/gup.c
This enables use of the hugepd_t type early. No functional change in this patch.
Signed-off-by: Aneesh Kumar K.V
---
include/linux/hugetlb.h | 47 ---
1 file changed, 24 insertions(+), 23 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linu
Signed-off-by: Aneesh Kumar K.V
---
arch/powerpc/mm/hugetlbpage.c | 43 +++
1 file changed, 43 insertions(+)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 80f6d2ed551a..5c829a83a4cc 100644
--- a/arch/powerpc/mm/hugetlbpag
Architectures like ppc64 support hugepage sizes that are not mapped to any of
the page table levels. Instead they add an alternate page table entry format
called hugepage directory (hugepd). hugepd indicates that the page table entry
maps to a set of hugetlb pages. Add support for this in generi
Makes the code easier to read. No functional changes in this patch. In a followup
patch, we will be updating the follow_page_mask to handle hugetlb hugepd format
so that archs like ppc64 can switch to the generic version. This split helps
in doing that nicely.
Reviewed-by: Naoya Horiguchi
Signed-off-by
Signed-off-by: Aneesh Kumar K.V
---
arch/powerpc/platforms/Kconfig.cputype | 5 +
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/platforms/Kconfig.cputype
b/arch/powerpc/platforms/Kconfig.cputype
index 8017542d..8acc4f27d101 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
With generic code now handling hugetlb entries at pgd level and also
supporting hugepage directory format, we can now remove the powerpc
specific follow_huge_addr implementation.
Signed-off-by: Aneesh Kumar K.V
---
arch/powerpc/mm/hugetlbpage.c | 64 ---
1
We use the kernel command line to do reservation of hugetlb pages. The code
duplication here is mostly to make it simpler. With 64 bit book3s, we need to
support either 16G or 1G gigantic hugepage. Whereas the FSL_BOOK3E
implementation needs to support multiple gigantic hugepage. We avoid the
gpage_
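As a worked example of the two book3s gigantic page sizes mentioned above: 16G corresponds to a shift of 34 and 1G to a shift of 30. The sketch below assumes that 1G is the size used in radix MMU mode and 16G the size used otherwise; the macro and function names are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this illustration. */
#define GPAGE_SHIFT_16G 34	/* 1 << 34 bytes = 16G */
#define GPAGE_SHIFT_1G  30	/* 1 << 30 bytes = 1G  */

/*
 * Assumption for this sketch: radix MMU mode uses 1G gigantic pages,
 * any other mode uses 16G gigantic pages.
 */
static inline uint64_t gigantic_page_size(bool radix)
{
	return UINT64_C(1) << (radix ? GPAGE_SHIFT_1G : GPAGE_SHIFT_16G);
}
```

A reservation routine could then size each reserved region from this one value instead of hard-coding a single shift.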
Add __hard_irqs_disabled() similar to arch_irqs_disabled to check whether irqs
are hard disabled.
Signed-off-by: Aneesh Kumar K.V
---
arch/powerpc/include/asm/hw_irq.h | 7 +++
1 file changed, 7 insertions(+)
diff --git a/arch/powerpc/include/asm/hw_irq.h
b/arch/powerpc/include/asm/hw_irq.
Now that we have made sure that the lockless walk of the Linux page table is
mostly limited to the current task (current->mm->pgdir), we can update the THP
update sequence to only send IPIs to CPUs on which this task has run. This helps
reduce the IPI overload on systems with a large number of CPUs.
W.r.t kvm eve
No functional change. Add newer helpers with additional warnings and use those.
---
arch/powerpc/include/asm/pgtable.h | 10 +
arch/powerpc/include/asm/pte-walk.h| 38 ++
arch/powerpc/kernel/eeh.c | 4 ++--
arch/powerpc/kernel/io-workaro
On 05/16/2017 02:47 PM, Aneesh Kumar K.V wrote:
> This moves the #ifdef in C code to a Kconfig dependency. We also move the
> gigantic_page_supported() function to be arch specific. This lets an arch
> conditionally enable runtime allocation of gigantic huge pages. Architectures
> like ppc64 suppor
On 05/16/2017 02:47 PM, Aneesh Kumar K.V wrote:
> POWER9 supports hugepages of size 2M and 1G in radix MMU mode. This patch
> enables the usage of 1G page size for hugetlbfs. This also updates the helper
> so that we can do 1G page allocation at runtime.
>
> We still don't enable 1G page size on DD1 v
The page table dump code doesn't know about huge pages, so currently
it crashes (or walks random memory, usually leading to a crash), if it
finds a huge page. On Book3S we only see huge pages in the Linux page
tables when we're using the P9 Radix MMU.
Teaching the code to properly handle huge page
Breno Leitao writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
>
> [23.138124] usercopy: kernel memory overwrite attempt detected to
> d3d80030 (mm_struct) (560 bytes)
> [23.13
Hi,
Today's mainline 4.12-rc1 fails to build for the attached configuration
file on Power7 box with below errors.
$ make
fs/built-in.o: In function `xfs_file_iomap_end':
fs/xfs/xfs_iomap.c:1152: undefined reference to `.put_dax'
fs/built-in.o: In function `xfs_file_iomap_begin':
fs/xfs/xfs_iomap.
[Cc'ing the relevant folks]
Breno Leitao writes:
> Hello,
>
> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> machine. Just SSHing into the machine causes this issue.
>
> [23.138124] usercopy: kernel memory overwrite attempt detected to
> d3d80030 (mm_struct
On Tue, May 16, 2017 at 1:02 PM, Abdul Haleem
wrote:
> Hi,
>
> Today's mainline 4.12-rc1 fails to build for the attached configuration
> file on Power7 box with below errors.
>
> $ make
> fs/built-in.o: In function `xfs_file_iomap_end':
> fs/xfs/xfs_iomap.c:1152: undefined reference to `.put_dax'
On Tue, 2017-05-16 at 14:56 +0530, Aneesh Kumar K.V wrote:
>
> +static inline bool __hard_irqs_disabled(void)
> +{
> + unsigned long flags = mfmsr();
> + return (flags & MSR_EE) == 0;
> +}
> +
Reading the MSR has a cost. Can't we rely on paca->irq_happened being
non-0 ?
(If you are
On Tue, 2017-05-16 at 14:56 +0530, Aneesh Kumar K.V wrote:
> +static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea,
> + bool *is_thp, unsigned *hshift)
> +{
> + VM_WARN((!arch_irqs_disabled() && !__hard_irqs_disabled()) ,
> + "%s c
Hi all,
this series attempts to provide a "modern" timer interface where the
callback gets the timer_list structure as an argument so that it
can use container_of instead of having to cast to/from unsigned long
all the time (or even worse use function pointer casts, we have quite
a few of those as
And just move the dereferences inline, given that the timer gets
passed as an argument.
Signed-off-by: Christoph Hellwig
---
kernel/time/timer.c | 16 +---
1 file changed, 5 insertions(+), 11 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 152a706ef8b8..c79
KTHREAD_DELAYED_WORK_INIT and DEFINE_KTHREAD_DELAYED_WORK are unused
and are using a timer helper that's about to go away.
Signed-off-by: Christoph Hellwig
---
include/linux/kthread.h | 11 ---
1 file changed, 11 deletions(-)
diff --git a/include/linux/kthread.h b/include/linux/kthread.
The new callback gets a pointer to the timer_list itself, which can
then be used to get the containing structure using container_of
instead of casting from and to unsigned long all the time.
The setup helpers take a flags argument instead of needing countless
variants.
Note: this further reduces
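The container_of pattern described above can be sketched in plain userspace C. Everything here is a mock for illustration: struct mock_timer, struct my_device and mock_fire() are hypothetical stand-ins, not the kernel's timer_list or its setup helpers, and container_of is redefined locally with offsetof.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock timer whose callback receives a pointer to the timer itself. */
struct mock_timer {
	void (*func)(struct mock_timer *timer);
};

/* Hypothetical driver structure that embeds the timer. */
struct my_device {
	int ticks;
	struct mock_timer timer;
};

/* Callback: recover the enclosing device from the timer pointer. */
static void my_device_timeout(struct mock_timer *timer)
{
	struct my_device *dev = container_of(timer, struct my_device, timer);

	dev->ticks++;
}

/* Stand-in for the timer core invoking an expired timer. */
static void mock_fire(struct mock_timer *timer)
{
	timer->func(timer);
}
```

Because the callback receives the embedded timer, no unsigned long cookie and no cast back to the device pointer are needed.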
Signed-off-by: Christoph Hellwig
---
include/linux/workqueue.h| 16 ++--
kernel/workqueue.c | 14 +++---
.../rcutorture/formal/srcu-cbmc/src/workqueues.h | 2 +-
3 files changed, 14 insertions(+), 1
Signed-off-by: Christoph Hellwig
---
arch/powerpc/mm/numa.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 371792e4418f..93a11227716b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1437,7 +1437,7
Signed-off-by: Christoph Hellwig
---
arch/s390/kernel/topology.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
index bb47c92476f0..4a0e867fca2b 100644
--- a/arch/s390/kernel/topology.c
+++ b/arch/s390/kernel/topol
Signed-off-by: Christoph Hellwig
---
arch/s390/kernel/lgr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/lgr.c b/arch/s390/kernel/lgr.c
index ae7dff110054..147124c05f28 100644
--- a/arch/s390/kernel/lgr.c
+++ b/arch/s390/kernel/lgr.c
@@ -153,14 +153,14
And remove a superfluous double-initialization.
Signed-off-by: Christoph Hellwig
---
drivers/char/tlclk.c | 36 +++-
1 file changed, 19 insertions(+), 17 deletions(-)
diff --git a/drivers/char/tlclk.c b/drivers/char/tlclk.c
index 572a51704e67..7144016da82c 100644
Signed-off-by: Christoph Hellwig
---
include/linux/timer.h | 22 +++---
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/include/linux/timer.h b/include/linux/timer.h
index 87afe52c8349..9c6694d3f66a 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -
On 2017/05/16 01:49PM, Balbir Singh wrote:
> arch_arm/disarm_probe use direct assignment for copying
> instructions, replace them with patch_instruction
Thanks for doing this!
We will also have to convert optprobes and ftrace to use
patch_instruction, but that can be done once the basic infrastr
On 2017/05/16 10:56AM, Anshuman Khandual wrote:
> On 05/16/2017 09:19 AM, Balbir Singh wrote:
> > patch_instruction is enhanced in this RFC to support
> > patching via a different virtual address (text_poke_area).
>
> Why writing instruction directly into the address is not
> sufficient and need t
On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman wrote:
> [Cc'ing the relevant folks]
>
> Breno Leitao writes:
>> Hello,
>>
>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>> machine. Just SSHing into the machine causes this issue.
>>
>> [23.138124] usercopy: kerne
On 05/16/2017 07:32 AM, Kees Cook wrote:
> On Tue, May 16, 2017 at 4:09 AM, Michael Ellerman wrote:
>> [Cc'ing the relevant folks]
>>
>> Breno Leitao writes:
>>> Hello,
>>>
>>> Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
>>> machine. Just SSHing into the machine causes t
On Tuesday 16 May 2017 03:52 PM, Anshuman Khandual wrote:
On 05/16/2017 02:47 PM, Aneesh Kumar K.V wrote:
This moves the #ifdef in C code to a Kconfig dependency. We also move the
gigantic_page_supported() function to be arch specific. This lets an arch
conditionally enable runtime allocation
On Tue, May 16, 2017 at 1:48 PM, Christoph Hellwig wrote:
> Hi all,
>
> this series attempts to provide a "modern" timer interface where the
> callback gets the timer_list structure as an argument so that it
> can use container_of instead of having to cast to/from unsigned long
> all the time (or
On Tue, May 16, 2017 at 05:45:07PM +0200, Arnd Bergmann wrote:
> This looks really nice, but what is the long-term plan for the interface?
> Do you expect that we will eventually change all 700+ users of timer_list
> to the new type, or do we keep both variants around indefinitely to avoid
> having
On Tue, May 16, 2017 at 09:02:29PM +1000, Michael Ellerman wrote:
> Breno Leitao writes:
>
> > Hello,
> >
> > Kernel 4.12-rc1 is showing a bug when I try it on a POWER8 virtual
> > machine. Just SSHing into the machine causes this issue.
> >
> > [23.138124] usercopy: kernel memory overwrit
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
Adjust the system_state check in pas_cpufreq_cpu_exit() to handle the extra
states.
Signed-off-by: Thomas Gleixner
Acked-by: Viresh Kumar
Cc: "Rafae
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
Adjust the system_state check in smp_generic_cpu_bootable() to handle the
extra states.
Signed-off-by: Thomas Gleixner
Acked-by: Michael Ellerman
Cc
On 05/16/17 04:48, Christoph Hellwig wrote:
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index e6789b8757d5..87afe52c8349 100644
> --- a/include/linux/timer.h
> +++ b/include/linux/timer.h
> @@ -126,6 +146,32 @@ static inline void init_timer_on_stack_key(struct
On Tue, May 16, 2017 at 1:48 PM, Christoph Hellwig wrote:
> Signed-off-by: Christoph Hellwig
> ---
> include/linux/timer.h | 22 +++---
> 1 file changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index 87afe52c8349..9c6694d3
On Tue, May 16, 2017 at 1:48 PM, Christoph Hellwig wrote:
> unsigned long expires;
> - void(*function)(unsigned long);
> + union {
> + void(*func)(struct timer_list *timer);
> + void(*function)(u
Balbir Singh a écrit :
patch_instruction is enhanced in this RFC to support
patching via a different virtual address (text_poke_area).
The mapping of text_poke_area->addr is RW and not RWX.
This way the mapping allows write for patching and then we tear
down the mapping. The downside is that we
On Tue, May 16, 2017 at 5:51 PM, Christoph Hellwig wrote:
> On Tue, May 16, 2017 at 05:45:07PM +0200, Arnd Bergmann wrote:
>> This looks really nice, but what is the long-term plan for the interface?
>> Do you expect that we will eventually change all 700+ users of timer_list
>> to the new type, o
On Fri, 2017-05-12 at 13:37 -0400, David Miller wrote:
> > Right now it is larger, but what I suggested is to add a new optimized
> > routine just for this case, which would do STBI for 64-bytes but
> > without membar (do membar at the end of memmap_init_zone() and
> > deferred_init_memmap()
> >
>
On Mon, 15 May 2017 23:35:03 +0530
"Naveen N. Rao" wrote:
> diff --git a/arch/powerpc/include/asm/kprobes.h
> b/arch/powerpc/include/asm/kprobes.h
> index a83821f33ea3..b6960ef213ac 100644
> --- a/arch/powerpc/include/asm/kprobes.h
> +++ b/arch/powerpc/include/asm/kprobes.h
> @@ -61,6 +61,15 @@ e
On Mon, 15 May 2017 23:35:04 +0530
"Naveen N. Rao" wrote:
> Fix a circa 2005 FIXME by implementing a check to ensure that we
> actually got into the jprobe break handler() due to the trap in
> jprobe_return().
>
> Signed-off-by: Naveen N. Rao
> ---
> arch/powerpc/kernel/kprobes.c | 20
On Tue, 2017-05-16 at 19:11 +0530, Naveen N. Rao wrote:
> On 2017/05/16 10:56AM, Anshuman Khandual wrote:
> > On 05/16/2017 09:19 AM, Balbir Singh wrote:
> > > patch_instruction is enhanced in this RFC to support
> > > patching via a different virtual address (text_poke_area).
> >
> > Why writing
On Tue, 2017-05-16 at 19:05 +0530, Naveen N. Rao wrote:
> On 2017/05/16 01:49PM, Balbir Singh wrote:
> > arch_arm/disarm_probe use direct assignment for copying
> > instructions, replace them with patch_instruction
>
> Thanks for doing this!
>
> We will also have to convert optprobes and ftrace t
On Tue, 2017-05-16 at 22:20 +0200, LEROY Christophe wrote:
> Balbir Singh a écrit :
>
> > patch_instruction is enhanced in this RFC to support
> > patching via a different virtual address (text_poke_area).
> > The mapping of text_poke_area->addr is RW and not RWX.
> > This way the mapping allows
Benjamin Herrenschmidt writes:
> On Tue, 2017-05-16 at 14:56 +0530, Aneesh Kumar K.V wrote:
>> +static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea,
>> + bool *is_thp, unsigned *hshift)
>> +{
>> + VM_WARN((!arch_irqs_disabled() && !__hard_irq
This moves the #ifdef in C code to a Kconfig dependency. We also move the
gigantic_page_supported() function to be arch specific. This lets an arch
conditionally enable runtime allocation of gigantic huge pages. Architectures
like ppc64 support different gigantic huge page sizes (16G and 1G) based
POWER9 supports hugepages of size 2M and 1G in radix MMU mode. This patch
enables the usage of 1G page size for hugetlbfs. This also updates the helper
so that we can do 1G page allocation at runtime.
We still don't enable 1G page size on the DD1 version. This is to avoid doing the
workaround mentioned in com
On Wed, 2017-05-17 at 08:57 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt writes:
>
> > On Tue, 2017-05-16 at 14:56 +0530, Aneesh Kumar K.V wrote:
> > > +static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea,
> > > + bool *is_thp, unsigned
On 05/16/2017 02:54 PM, Aneesh Kumar K.V wrote:
> +void __init reserve_hugetlb_gpages(void)
> +{
> + char buf[10];
> + phys_addr_t base;
> + unsigned long gpage_size = 1UL << 34;
> + static __initdata char cmdline[COMMAND_LINE_SIZE];
> +
> + if (radix_enabled())
> +
On Wednesday 17 May 2017 10:27 AM, Benjamin Herrenschmidt wrote:
On Wed, 2017-05-17 at 08:57 +0530, Aneesh Kumar K.V wrote:
Benjamin Herrenschmidt writes:
On Tue, 2017-05-16 at 14:56 +0530, Aneesh Kumar K.V wrote:
+static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea,
+
On Wednesday 17 May 2017 10:31 AM, Anshuman Khandual wrote:
On 05/16/2017 02:54 PM, Aneesh Kumar K.V wrote:
+void __init reserve_hugetlb_gpages(void)
+{
+ char buf[10];
+ phys_addr_t base;
+ unsigned long gpage_size = 1UL << 34;
+ static __initdata char cmdline[COMMAND_
71 matches