Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-06-29 Thread Christopher Lameter
On Wed, 24 Jun 2020, Srikar Dronamraju wrote:
> Currently Linux kernel with CONFIG_NUMA on a system with multiple
> possible nodes, marks node 0 as online at boot. However in practice,
> there are systems which have node 0 as memoryless and cpuless.
Maybe add something to explain why you are not

Re: [PATCH v4 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-12 Thread Christopher Lameter
On Tue, 12 May 2020, Srikar Dronamraju wrote:
> +#ifdef CONFIG_NUMA
> + [N_ONLINE] = NODE_MASK_NONE,
Again. Same issue as before. If you do this then you do a global change for all architectures. You need to put something in the early boot sequence (in a non architecture specific way) that se

Re: [PATCH v3 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-02 Thread Christopher Lameter
On Fri, 1 May 2020, Srikar Dronamraju wrote:
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -116,8 +116,10 @@ EXPORT_SYMBOL(latent_entropy);
> */
> nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
> [N_POSSIBLE] = NODE_MASK_ALL,
> +#ifdef CONFIG_NUMA
> + [N_ONLINE] = NOD

Re: [PATCH v3 1/3] powerpc/numa: Set numa_node for all possible cpus

2020-05-02 Thread Christopher Lameter
On Fri, 1 May 2020, Srikar Dronamraju wrote:
> - for_each_present_cpu(cpu)
> - numa_setup_cpu(cpu);
> + for_each_possible_cpu(cpu) {
> + /*
> + * Powerpc with CONFIG_NUMA always used to have a node 0,
> + * even if it was memoryless or cpul

Re: [PATCH v2 3/4] mm: Implement reset_numa_mem

2020-03-18 Thread Christopher Lameter
On Wed, 18 Mar 2020, Srikar Dronamraju wrote:
> For a memoryless or offline nodes, node_numa_mem refers to a N_MEMORY
> fallback node. Currently kernel has an API set_numa_mem that sets
> node_numa_mem for memoryless node. However this API cannot be used for
> offline nodes. Hence all offline node

Re: [PATCH 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-03-18 Thread Christopher Lameter
On Mon, 16 Mar 2020, Michal Hocko wrote:
> > We can dynamically number the nodes right? So just make sure that the
> > firmware properly creates memory on node 0?
>
> Are you suggesting that the OS would renumber NUMA nodes coming
> from FW just to satisfy node 0 existence? If yes then I believe t

Re: [PATCH 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-03-15 Thread Christopher Lameter
On Wed, 11 Mar 2020, Srikar Dronamraju wrote:
> Currently Linux kernel with CONFIG_NUMA on a system with multiple
> possible nodes, marks node 0 as online at boot. However in practice,
> there are systems which have node 0 as memoryless and cpuless.
Would it not be better and simpler to require

Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9

2020-02-26 Thread Christopher Lameter
On Wed, 26 Feb 2020, Michal Hocko wrote:
> Besides that kmalloc_node shouldn't really have an implicit GFP_THISNODE
> semantic right? At least I do not see anything like that documented
> anywhere.
Kmalloc_node does not support memory policies etc. Only kmalloc does. kmalloc_node is mostly used b

Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9

2020-02-26 Thread Christopher Lameter
On Mon, 24 Feb 2020, Michal Hocko wrote:
> Hmm, nasty. Is there any reason why kmalloc_node behaves differently
> from the page allocator?
The page allocator will do the same thing if you pass GFP_THISNODE and insist on allocating memory from a node that does not exist.
> > > A short summary. k

Re: [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9

2020-02-21 Thread Christopher Lameter
On Tue, 18 Feb 2020, Michal Hocko wrote:
> Anyway, I do not think it is expected that kmalloc_node just blows up
> on those nodes. The page allocator simply falls back to the closest
> node. Something for kmalloc maintainers I believe.
That is the case for an unconstrained allocation. kmalloc_nod

Re: [PATCH v5 02/11] powerpc/mm: Adds counting method to monitor lockless pgtable walks

2019-10-08 Thread Christopher Lameter
On Tue, 8 Oct 2019, Leonardo Bras wrote:
> So you say that the performance impact of using my approach is the same
> as using locks? (supposing that lock never waits)
>
> So, there are 'lockless pagetable walks' only for the sake of better
> performance?
I thought that was the major motivation

Re: [PATCH v5 02/11] powerpc/mm: Adds counting method to monitor lockless pgtable walks

2019-10-08 Thread Christopher Lameter
On Tue, 8 Oct 2019, Leonardo Bras wrote:
> > You are creating contention on a single exclusive cacheline. Doesnt this
> > defeat the whole purpose of the lockless page table walk? Use mmap_sem or
> > so should cause the same performance regression?
>
> Sorry, I did not understand that question.
>

Re: [PATCH v5 02/11] powerpc/mm: Adds counting method to monitor lockless pgtable walks

2019-10-08 Thread Christopher Lameter
On Wed, 2 Oct 2019, Leonardo Bras wrote:
> +
> +inline unsigned long __begin_lockless_pgtbl_walk(struct mm_struct *mm,
> + bool disable_irq)
> +{
> + unsigned long irq_mask = 0;
> +
> + if (IS_ENABLED(CONFIG_LOCKLESS_PAGE_TABLE_WALK_TRACKING)

Re: [PATCH 0/5] use pinned_vm instead of locked_vm to account pinned pages

2019-02-15 Thread Christopher Lameter
On Thu, 14 Feb 2019, Jason Gunthorpe wrote: > On Thu, Feb 14, 2019 at 01:46:51PM -0800, Ira Weiny wrote: > > > > > > Really unclear how to fix this. The pinned/locked split with two > > > > > buckets may be the right way. > > > > > > > > Are you suggesting that we have 2 user limits? > > > > > > T

Re: [PATCH 2/5] vfio/spapr_tce: use pinned_vm instead of locked_vm to account pinned pages

2019-02-12 Thread Christopher Lameter
On Tue, 12 Feb 2019, Alexey Kardashevskiy wrote:
> Now it is 3 independent accesses (actually 4 but the last one is
> diagnostic) with no locking around them. Why do not we need a lock
> anymore precisely? Thanks,
Updating a regular counter is racy and requires a lock. It was converted to be an a