Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-25 Thread Frederic Barrat
On 25/08/2017 at 09:44, Benjamin Herrenschmidt wrote: On Fri, 2017-08-25 at 06:53 +0200, Frederic Barrat wrote: On 24/08/2017 at 20:47, Benjamin Herrenschmidt wrote: On Thu, 2017-08-24 at 18:40 +0200, Frederic Barrat wrote: The decrementing part is giving me troubles, and I think it ma

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-25 Thread Benjamin Herrenschmidt
On Fri, 2017-08-25 at 06:53 +0200, Frederic Barrat wrote: > > On 24/08/2017 at 20:47, Benjamin Herrenschmidt wrote: > > On Thu, 2017-08-24 at 18:40 +0200, Frederic Barrat wrote: > > > > > > The decrementing part is giving me troubles, and I think it makes sense: > > > if I decrement the counter

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-24 Thread Frederic Barrat
On 24/08/2017 at 20:47, Benjamin Herrenschmidt wrote: On Thu, 2017-08-24 at 18:40 +0200, Frederic Barrat wrote: The decrementing part is giving me troubles, and I think it makes sense: if I decrement the counter when detaching the context from the capi card, then the next TLBIs for the memo

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-24 Thread Benjamin Herrenschmidt
On Thu, 2017-08-24 at 18:40 +0200, Frederic Barrat wrote: > > The decrementing part is giving me troubles, and I think it makes sense: > if I decrement the counter when detaching the context from the capi > card, then the next TLBIs for the memory context may be back to local. Yes, you need to
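The concern raised about decrementing can be reproduced in a toy user-space model. This is only a sketch under stated assumptions: it assumes that attaching a context to an accelerator bumps the same counter a CPU would, which the discussion implies but the excerpts do not show. The names `toy_ctx`, `toy_attach_accel`, `toy_detach_accel` and `toy_tlbi_can_be_local` are hypothetical and are not kernel APIs, and the plain `int` stands in for the kernel's atomic counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an mm context; not the kernel structure. */
struct toy_ctx {
    uint64_t cpumask; /* bit n set: CPU n has used this context */
    int      users;   /* CPUs in cpumask plus attached accelerators (assumption) */
};

/* Assumption: an attached accelerator is counted like one more active CPU. */
static void toy_attach_accel(struct toy_ctx *ctx)
{
    ctx->users++;
}

/* Detaching drops the count again; this is the step under discussion. */
static void toy_detach_accel(struct toy_ctx *ctx)
{
    assert(ctx->users > 0);
    ctx->users--;
}

/* Same shape as the counter test in the patch: at most one user, and it is us. */
static bool toy_tlbi_can_be_local(const struct toy_ctx *ctx, unsigned int cpu)
{
    assert(cpu < 64);
    if (ctx->users > 1)
        return false;
    return (ctx->cpumask >> cpu) & 1;
}
```

In this model a context used by CPU 0 is local, stops being local while an accelerator is attached, and is reported as local again immediately after `toy_detach_accel()`. That return to local invalidations is the behaviour described in the quoted message; the sketch does not attempt to show how the thread resolved it.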

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-24 Thread Frederic Barrat
On 21/08/2017 at 19:35, Benjamin Herrenschmidt wrote: On Mon, 2017-08-21 at 19:27 +0200, Frederic Barrat wrote: Hi Ben, On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote: Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thu

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-22 Thread Benjamin Herrenschmidt
it easier. > Arguably what happens on those accelerators is pretty close to an active > cpu. > > Once it is merged, I'm going to have to backport your patch (and an > update to mine) to the p9-supporting distros. From a quick look, your > patch, i.e. "[PATCH 5/6] pow

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-22 Thread Frederic Barrat
nce it is merged, I'm going to have to backport your patch (and an update to mine) to the p9-supporting distros. From a quick look, your patch, i.e. "[PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's" is completely independent from the rest of the series, right? Fred

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-21 Thread Michael Ellerman
Frederic Barrat writes: > Hi Ben, > > On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote: >> Instead of comparing the whole CPU mask every time, let's >> keep a counter of how many bits are set in the mask. Thus >> testing for a local mm only requires testing if that counter >> is 1 and the c

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-21 Thread Benjamin Herrenschmidt
On Mon, 2017-08-21 at 19:27 +0200, Frederic Barrat wrote: > Hi Ben, > > On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote: > > Instead of comparing the whole CPU mask every time, let's > > keep a counter of how many bits are set in the mask. Thus > > testing for a local mm only requires testi

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-21 Thread Frederic Barrat
Hi Ben, On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote: Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask. I'm

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-04 Thread Benjamin Herrenschmidt
On Fri, 2017-08-04 at 14:06 +0200, Frederic Barrat wrote: > > +#ifdef CONFIG_PPC_BOOK3S_64 > > +static inline int mm_is_thread_local(struct mm_struct *mm) > > +{ > > + if (atomic_read(&mm->context.active_cpus) > 1) > > + return false; > > + return cpumask_test_cpu(smp_processor_

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-08-04 Thread Frederic Barrat
On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote: Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask. Signed-off-b

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-25 Thread Michael Ellerman
Nicholas Piggin writes: > On Mon, 24 Jul 2017 23:46:44 +1000 > Michael Ellerman wrote: > >> Nicholas Piggin writes: >> >> > On Mon, 24 Jul 2017 14:28:02 +1000 >> > Benjamin Herrenschmidt wrote: >> > >> >> Instead of comparing the whole CPU mask every time, let's >> >> keep a counter of how

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-25 Thread Nicholas Piggin
On Tue, 25 Jul 2017 11:03:45 +1000 Benjamin Herrenschmidt wrote: > On Tue, 2017-07-25 at 10:44 +1000, Nicholas Piggin wrote: > > The two variants are just cleaner versions of the two variants you > > already introduced. > > > > static inline bool mm_activate_cpu(struct mm_struct *mm) > > { > >

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Benjamin Herrenschmidt
On Tue, 2017-07-25 at 10:44 +1000, Nicholas Piggin wrote: > The two variants are just cleaner versions of the two variants you > already introduced. > > static inline bool mm_activate_cpu(struct mm_struct *mm) > { > if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next))) { > cpuma

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Nicholas Piggin
On Tue, 25 Jul 2017 06:58:46 +1000 Benjamin Herrenschmidt wrote: > On Mon, 2017-07-24 at 21:25 +1000, Nicholas Piggin wrote: > > > +#ifdef CONFIG_PPC_BOOK3S_64 > > > +static inline void inc_mm_active_cpus(struct mm_struct *mm) > > > +{ > > > + atomic_inc(&mm->context.active_cpus); > > > +} >

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Nicholas Piggin
On Mon, 24 Jul 2017 23:46:44 +1000 Michael Ellerman wrote: > Nicholas Piggin writes: > > > On Mon, 24 Jul 2017 14:28:02 +1000 > > Benjamin Herrenschmidt wrote: > > > >> Instead of comparing the whole CPU mask every time, let's > >> keep a counter of how many bits are set in the mask. Thus >

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Benjamin Herrenschmidt
On Mon, 2017-07-24 at 21:25 +1000, Nicholas Piggin wrote: > > +#ifdef CONFIG_PPC_BOOK3S_64 > > +static inline void inc_mm_active_cpus(struct mm_struct *mm) > > +{ > > + atomic_inc(&mm->context.active_cpus); > > +} > > +#else > > +static inline void inc_mm_active_cpus(struct mm_struct *mm) { } >

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Michael Ellerman
Nicholas Piggin writes: > On Mon, 24 Jul 2017 14:28:02 +1000 > Benjamin Herrenschmidt wrote: > >> Instead of comparing the whole CPU mask every time, let's >> keep a counter of how many bits are set in the mask. Thus >> testing for a local mm only requires testing if that counter >> is 1 and the

Re: [PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-24 Thread Nicholas Piggin
On Mon, 24 Jul 2017 14:28:02 +1000 Benjamin Herrenschmidt wrote: > Instead of comparing the whole CPU mask every time, let's > keep a counter of how many bits are set in the mask. Thus > testing for a local mm only requires testing if that counter > is 1 and the current CPU bit is set in the mask

[PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

2017-07-23 Thread Benjamin Herrenschmidt
Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask. Signed-off-by: Benjamin Herrenschmidt --- arch/powerpc/include/asm/boo
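The idea in the commit message can be illustrated with a small user-space model. This is a sketch, not the patch itself: it assumes at most 64 CPUs so the mask fits in one `uint64_t`, it uses a plain `int` where the kernel code quoted elsewhere in the thread uses an atomic counter, and every `toy_` name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 64 /* hypothetical limit: one mask bit per CPU */

/* Toy stand-in for an mm; not the kernel's struct mm_struct. */
struct toy_mm {
    uint64_t cpumask;     /* bit n set: CPU n has run this mm */
    int      active_cpus; /* number of bits set in cpumask */
};

/* Record that cpu uses mm; count only the first 0 -> 1 transition of its bit. */
static void toy_mm_activate_cpu(struct toy_mm *mm, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    uint64_t bit = UINT64_C(1) << cpu;
    if (!(mm->cpumask & bit)) {
        mm->cpumask |= bit;
        mm->active_cpus++;
    }
}

/* Baseline: compare the whole mask against "only this CPU". */
static bool toy_mm_is_local_by_mask(const struct toy_mm *mm, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    return mm->cpumask == (UINT64_C(1) << cpu);
}

/* Counter variant: no more than one active CPU, and the current CPU's bit is set. */
static bool toy_mm_is_local_by_count(const struct toy_mm *mm, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    if (mm->active_cpus > 1)
        return false;
    return (mm->cpumask >> cpu) & 1;
}
```

As long as the counter is kept equal to the number of set bits, both tests give the same answer. The counter variant reads one integer and one bit, whereas a full mask comparison in a real kernel has to walk every word of a mask sized for the maximum CPU count.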