Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-24 Thread Nicholas Piggin
On Sat, 23 Jul 2016 20:36:42 +1000 Balbir Singh wrote: > On Sat, Jul 23, 2016 at 05:10:36PM +1000, Nicholas Piggin wrote: > > On Sat, 23 Jul 2016 12:19:37 +1000 > > Balbir Singh wrote: > > > > > On Fri, Jul 22, 2016 at 10:57:28PM +1000, Nicholas Piggin wrote: > > > > Calculating the slice m

Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-24 Thread Nicholas Piggin
On Sat, 23 Jul 2016 18:49:06 +1000 Benjamin Herrenschmidt wrote: > On Sat, 2016-07-23 at 17:10 +1000, Nicholas Piggin wrote: > > I wanted to avoid doing more work under slice_convert_lock, but > > we should just make that a per-mm lock anyway shouldn't we? > > Aren't the readers under the mm s

Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-23 Thread Balbir Singh
On Sat, Jul 23, 2016 at 05:10:36PM +1000, Nicholas Piggin wrote: > On Sat, 23 Jul 2016 12:19:37 +1000 > Balbir Singh wrote: > > > On Fri, Jul 22, 2016 at 10:57:28PM +1000, Nicholas Piggin wrote: > > > Calculating the slice mask can become a significant overhead for > > > get_unmapped_area. The mas

Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-23 Thread Benjamin Herrenschmidt
On Sat, 2016-07-23 at 17:10 +1000, Nicholas Piggin wrote: > I wanted to avoid doing more work under slice_convert_lock, but > we should just make that a per-mm lock anyway shouldn't we? Aren't the readers under the mm sem taken for writing, or has this changed? Cheers, Ben.

Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-23 Thread Nicholas Piggin
On Sat, 23 Jul 2016 12:19:37 +1000 Balbir Singh wrote: > On Fri, Jul 22, 2016 at 10:57:28PM +1000, Nicholas Piggin wrote: > > Calculating the slice mask can become a significant overhead for > > get_unmapped_area. The mask is relatively small and does not change > > frequently, so we can cache it

Re: [PATCH] powerpc/64: implement a slice mask cache

2016-07-22 Thread Balbir Singh
On Fri, Jul 22, 2016 at 10:57:28PM +1000, Nicholas Piggin wrote: > Calculating the slice mask can become a significant overhead for > get_unmapped_area. The mask is relatively small and does not change > frequently, so we can cache it in the mm context. > > This saves about 30% kernel time on a 4K