Re: [v6 05/15] mm: don't accessed uninitialized struct pages

2017-08-17 Thread Pasha Tatashin
to use this iterator, which will simplify it. Pasha On 08/14/2017 09:51 AM, Pasha Tatashin wrote: mem_init() -> free_all_bootmem() -> free_low_memory_core_early() -> for_each_reserved_mem_region() -> reserve_bootmem_region() -> init_reserved_page() <- if this is deferr
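
A rough sketch (hedged, not the exact kernel code) of how that call chain walks the reserved regions; the function name below is made up, while the iterator and reserve_bootmem_region() are the real ones from the chain quoted above:

	/* free_low_memory_core_early() hands every reserved memblock region to
	 * reserve_bootmem_region(), which calls init_reserved_page() per pfn. */
	static void __init walk_reserved_regions_sketch(void)
	{
		phys_addr_t start, end;
		u64 i;

		for_each_reserved_mem_region(i, &start, &end)
			reserve_bootmem_region(start, end);
	}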

Re: [v6 01/15] x86/mm: reserve only exiting low pages

2017-08-17 Thread Pasha Tatashin
Hi Michal, While working on a bug that was reported to me by "kernel test robot". unable to handle kernel NULL pointer dereference at (null) The issue was that page_to_pfn() on that configuration was looking for a section inside flags fields in "struct page". So, reserved but unava

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-26 Thread Pasha Tatashin
Hi Michal, I have considered your proposals: 1. Making memset(0) unconditional inside __init_single_page() is not going to work because it slows down SPARC, and ppc64. On SPARC even the BSTI optimization that I have proposed earlier won't work, because after consulting with other engineers I
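
For orientation, a hedged sketch of where that zeroing sits: __init_single_page() already fills in most fields, and the question above is whether the leading zeroing is an unconditional memset() or the per-arch mm_zero_struct_page() hook proposed in this series (the function name below is illustrative):

	static void __meminit init_single_page_sketch(struct page *page,
						      unsigned long pfn,
						      unsigned long zone, int nid)
	{
		mm_zero_struct_page(page);	/* memset(0) unless the arch overrides it */
		set_page_links(page, zone, nid, pfn);
		init_page_count(page);
		page_mapcount_reset(page);
		INIT_LIST_HEAD(&page->lru);
	}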

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-30 Thread Pasha Tatashin
Could you be more specific? E.g. how are other stores done in __init_single_page safe then? I am sorry to be dense here but how does the full 64B store differ from other stores done in the same function. Hi Michal, It is safe to do regular 8-byte and smaller stores (stx, st, sth, stb) without

Re: [PATCH 1/2] Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"

2018-08-27 Thread Pasha Tatashin
On Mon, Aug 27, 2018 at 8:31 AM Masayoshi Mizuma wrote: > > Hi Pavel, > > I would appreciate if you could send the feedback for the patch. I will study it today. Pavel > > Thanks! > Masa > > On 08/24/2018 04:29 AM, Michal Hocko wrote: > > On Fri 24-08-18 00:03:25, Naoya Horiguchi wrote: > >> (C

Re: [PATCH 1/2] Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"

2018-08-27 Thread Pasha Tatashin
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote: > From: Masayoshi Mizuma > > commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into > memblock.reserved") breaks movable_node kernel option because it > changed the memory gap range to reserved memblock. So, the node > is marked as Normal zone

Re: [PATCH 2/2] mm: zero remaining unavailable struct pages

2018-08-27 Thread Pasha Tatashin
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote: > From: Naoya Horiguchi > > There is a kernel panic that is triggered when reading /proc/kpageflags > on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]': > > BUG: unable to handle kernel paging request at fffe > PGD 9b2

Re: [PATCH v4 3/4] mm/memory_hotplug: Define nodemask_t as a stack variable

2018-08-28 Thread Pasha Tatashin
On 8/17/18 5:00 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, unregister_mem_sect_under_nodes() tries to allocate a nodemask_t > in order to check within the loop which nodes have already been unlinked, > so we do not repeat the operation on them. > > NODEMASK_ALLOC calls km

Re: [PATCH] memory_hotplug: fix kernel_panic on offline page processing

2018-08-28 Thread Pasha Tatashin
On 8/28/18 7:25 AM, Michal Hocko wrote: > On Tue 28-08-18 11:05:39, Mikhail Zaslonko wrote: >> Within show_valid_zones() the function test_pages_in_a_zone() should be >> called for online memory blocks only. Otherwise it might lead to the >> VM_BUG_ON due to uninitialized struct pages (when CONFI
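
A minimal sketch of the shape of the fix being discussed (illustrative only; the helper name and the "none" output are made up): check the block state before touching its struct pages:

	static ssize_t show_valid_zones_sketch(struct memory_block *mem, char *buf)
	{
		if (mem->state != MEM_ONLINE)	/* struct pages may be uninitialized */
			return sprintf(buf, "none\n");

		/* only now is test_pages_in_a_zone() safe to call */
		return valid_zones_of_online_block_sketch(mem, buf);
	}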

Re: [RFC v2 2/2] mm/memory_hotplug: Shrink spanned pages when offlining memory

2018-08-29 Thread Pasha Tatashin
On 8/17/18 11:41 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, we decrement zone/node spanned_pages when we > remove memory and not when we offline it. > > This, besides not being consistent with the current code, > implies that we can access stale pages if we never get to

Re: [PATCH] mm/page_alloc: Clean up check_for_memory

2018-08-29 Thread Pasha Tatashin
On 8/28/18 5:01 PM, Oscar Salvador wrote: > From: Oscar Salvador > > check_for_memory looks a bit confusing. > First of all, we have this: > > if (N_MEMORY == N_NORMAL_MEMORY) > return; > > Checking the ENUM declaration, looks like N_MEMORY cannot be equal to > N_NORMAL_MEMORY. > I could

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-11 Thread Pasha Tatashin
Have you measured that? I do not think it would be super hard to measure. I would be quite surprised if this added much if anything at all as the whole struct page should be in the cache line already. We do set reference count and other struct members. Almost nobody should be looking at our page

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-11 Thread Pasha Tatashin
do one membar call after all "struct pages" are initialized. I think what I sent out already is cleaner and better solution, because I am not sure what kind of performance we would see on other chips. On 05/11/2017 04:47 PM, Pasha Tatashin wrote: Have you measured that? I do no

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-12 Thread Pasha Tatashin
On 05/12/2017 12:57 PM, David Miller wrote: From: Pasha Tatashin Date: Thu, 11 May 2017 16:59:33 -0400 We should either keep memset() only for deferred struct pages as what I have in my patches. Another option is to add a new function struct_page_clear() which would default to memset() and

Re: [PATCH v2 3/4] mm/memory_hotplug: Make register_mem_sect_under_node a cb of walk_memory_range

2018-08-16 Thread Pasha Tatashin
On 18-06-22 13:18:38, osalva...@techadventures.net wrote: > From: Oscar Salvador > > link_mem_sections() and walk_memory_range() share most of the code, > so we can convert link_mem_sections() into a dummy function that calls > walk_memory_range() with a callback to register_mem_sect_under_no

Re: [PATCH v3 1/4] mm/memory-hotplug: Drop unused args from remove_memory_section

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:16, Oscar Salvador wrote: > From: Oscar Salvador > > unregister_memory_section() calls remove_memory_section() > with three arguments: > > * node_id > * section > * phys_device > > Neither node_id nor phys_device are used. > Let us drop them from the function. Looks good: Rev

Re: [PATCH v3 2/4] mm/memory_hotplug: Drop mem_blk check from unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:17, Oscar Salvador wrote: > From: Oscar Salvador > > Before calling to unregister_mem_sect_under_nodes(), > remove_memory_section() already checks if we got a valid memory_block. > > No need to check that again in unregister_mem_sect_under_nodes(). > > If more functions start

Re: [PATCH v3 3/4] mm/memory_hotplug: Refactor unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
> > d) What's the maximum number of nodes, ever? Perhaps we can always > >fit a nodemask_t onto the stack, dunno. > > Right now, we define the maximum as NODES_SHIFT = 10, so: > > 1 << 10 = 1024 Maximum nodes. > > Since this makes only 128 bytes, I wonder if we can just go ahead and define
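
A quick back-of-the-envelope check of the 128-byte figure (standalone toy, not kernel code): a nodemask_t is one bit per possible node.

	#include <stdio.h>

	int main(void)
	{
		unsigned int nodes_shift = 10;			/* NODES_SHIFT */
		unsigned int max_numnodes = 1u << nodes_shift;	/* 1024 nodes */
		unsigned int nodemask_bytes = max_numnodes / 8;	/* bitmap size */

		printf("%u nodes -> nodemask_t is %u bytes\n",
		       max_numnodes, nodemask_bytes);
		return 0;
	}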

Re: [PATCH v3 4/4] mm/memory_hotplug: Drop node_online check in unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:19, Oscar Salvador wrote: > From: Oscar Salvador > > We are getting the nid from the pages that are not yet removed, > but a node can only be offline when its memory/cpu's have been removed. > Therefore, we know that the node is still online. Reviewed-by: Pavel Tatashin > >

Re: [RESEND PATCH v10 0/6] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64

2018-08-16 Thread Pasha Tatashin
On 18-08-15 15:34:56, Andrew Morton wrote: > On Fri, 6 Jul 2018 17:01:09 +0800 Jia He wrote: > > > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > > where possible") optimized the loop in memmap_init_zone(). But it causes > > possible panic bug. So Daniel Vacek reverted

Re: [RESEND PATCH v10 2/6] mm: page_alloc: remain memblock_next_valid_pfn() on arm/arm64

2018-08-16 Thread Pasha Tatashin
On 18-07-06 17:01:11, Jia He wrote: > From: Jia He > > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > where possible") optimized the loop in memmap_init_zone(). But it causes > possible panic bug. So Daniel Vacek reverted it later. > > But as suggested by Daniel Vacek,

Re: [RESEND PATCH v10 3/6] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

2018-08-16 Thread Pasha Tatashin
> Signed-off-by: Jia He > --- > mm/memblock.c | 37 + > 1 file changed, 29 insertions(+), 8 deletions(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index ccad225..84f7fa7 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -1140,31 +1140,52 @@ int _
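
The idea behind the quoted diff, in a hedged sketch (the struct memblock_type fields are the real ones; the cached index variable and helper are named here only for illustration): remember which memory region matched last time and re-test it before doing a fresh binary search, since memmap_init_zone() walks pfns in ascending order.

	static int cached_region_idx __initdata;	/* illustrative cache */

	static bool __init pfn_in_cached_region(unsigned long pfn)
	{
		struct memblock_type *type = &memblock.memory;
		phys_addr_t addr = PFN_PHYS(pfn);
		int i = cached_region_idx;

		return i < type->cnt &&
		       addr >= type->regions[i].base &&
		       addr <  type->regions[i].base + type->regions[i].size;
	}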

Re: [RESEND PATCH v10 3/6] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

2018-08-16 Thread Pasha Tatashin
On 8/16/18 9:08 PM, Pavel Tatashin wrote: > >> Signed-off-by: Jia He >> --- >> mm/memblock.c | 37 + >> 1 file changed, 29 insertions(+), 8 deletions(-) >> >> diff --git a/mm/memblock.c b/mm/memblock.c >> index ccad225..84f7fa7 100644 >> --- a/mm/memblock.c

Re: [RESEND PATCH v10 6/6] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

2018-08-16 Thread Pasha Tatashin
On 7/6/18 5:01 AM, Jia He wrote: > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > where possible") optimized the loop in memmap_init_zone(). But there is > still some room for improvement. E.g. in early_pfn_valid(), if pfn and > pfn+1 are in the same memblock region, we

Re: [mm PATCH v4 3/6] mm: Use memblock/zone specific iterator for handling deferred page init

2018-10-31 Thread Pasha Tatashin
On 10/17/18 7:54 PM, Alexander Duyck wrote: > This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone. > > This iterator will take care of making sure a given memory range provided > is in fact contained within a zone. It takes care of all the bounds checking > we were doing in d
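
A hedged usage sketch of the new iterator (the macro is the one introduced by the patch; the helper called in the loop body is made up):

	static unsigned long __init deferred_init_zone_sketch(struct zone *zone)
	{
		unsigned long spfn, epfn, nr_init = 0;
		u64 i;

		/* Only walk free (memory && !reserved) pfn ranges inside this zone,
		 * instead of re-checking the bounds for every single pfn. */
		for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn)
			nr_init += deferred_init_range_sketch(spfn, epfn);

		return nr_init;
	}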

Re: [mm PATCH v4 3/6] mm: Use memblock/zone specific iterator for handling deferred page init

2018-10-31 Thread Pasha Tatashin
On 10/31/18 12:05 PM, Alexander Duyck wrote: > On Wed, 2018-10-31 at 15:40 +0000, Pasha Tatashin wrote: >> >> On 10/17/18 7:54 PM, Alexander Duyck wrote: >>> This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone. >>> >>> This iter

[PATCH v2] vhost-vdpa: account iommu allocations

2023-12-26 Thread Pasha Tatashin
iommu allocations should be accounted in order to allow admins to monitor and limit the amount of iommu memory. Signed-off-by: Pasha Tatashin Acked-by: Michael S. Tsirkin --- drivers/vhost/vdpa.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Changelog: v1: This patch is spun off
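
The gist of the one-line change, sketched in hedged form (the real hunk is in drivers/vhost/vdpa.c and the wrapper name here is made up): pass an accounted GFP so the IOMMU page-table memory allocated for the mapping is charged to the caller's cgroup.

	static int vdpa_iommu_map_sketch(struct vhost_vdpa *v, u64 iova,
					 phys_addr_t pa, u64 size, u32 perm)
	{
		return iommu_map(v->domain, iova, pa, size,
				 perm_to_iommu_flags(perm), GFP_KERNEL_ACCOUNT);
	}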

[PATCH] vhost-vdpa: account iommu allocations

2023-11-30 Thread Pasha Tatashin
iommu allocations should be accounted in order to allow admins to monitor and limit the amount of iommu memory. Signed-off-by: Pasha Tatashin --- drivers/vhost/vdpa.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) This patch is spun off from the series: https://lore.kernel.org/all

Re: [PATCH v1] virtio_pmem: populate numa information

2022-10-26 Thread Pasha Tatashin
g the numa node is taken from cxl_pmem_region_probe in > drivers/cxl/pmem.c. > > Signed-off-by: Michael Sammler Enables the hot-plugging of virtio-pmem memory into correct memory nodes. Does not look like it affects FS_DAX. Reviewed-by: Pasha Tatashin Thanks, Pasha > --- >

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-12 Thread Pasha Tatashin
On 9/12/18 10:27 AM, Gerald Schaefer wrote: > On Wed, 12 Sep 2018 15:39:33 +0200 > Michal Hocko wrote: > >> On Wed 12-09-18 15:03:56, Gerald Schaefer wrote: >> [...] >>> BTW, those sysfs attributes are world-readable, so anyone can trigger >>> the panic by simply reading them, or just run lsmem

Re: [PATCH 1/2] mm: Move page struct poisoning from CONFIG_DEBUG_VM to CONFIG_DEBUG_VM_PGFLAGS

2018-09-04 Thread Pasha Tatashin
Hi Alexander, This is a wrong way to do it. memblock_virt_alloc_try_nid_raw() does not initialize allocated memory, and by setting memory to all ones in debug build we ensure that no callers rely on this function to return zeroed memory just by accident. And, the accidents are frequent because mo
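
A standalone toy (hedged, with user-space stand-ins for the kernel names) showing why the all-ones debug fill matters: a caller that silently assumes the raw allocator zeroes memory sees 0xff garbage immediately instead of passing by luck.

	#include <assert.h>
	#include <stdlib.h>
	#include <string.h>

	#define DEBUG_POISON 1			/* stands in for CONFIG_DEBUG_VM */

	static void *alloc_raw(size_t size)	/* stands in for the *_raw allocator */
	{
		void *p = malloc(size);		/* contents undefined, by design */

		if (p && DEBUG_POISON)
			memset(p, 0xff, size);	/* all ones, never accidentally zero */
		return p;
	}

	int main(void)
	{
		unsigned char *p = alloc_raw(64);

		assert(p && p[0] != 0);		/* code expecting zeroed memory trips here */
		free(p);
		return 0;
	}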

Re: [PATCH 1/2] mm: Move page struct poisoning from CONFIG_DEBUG_VM to CONFIG_DEBUG_VM_PGFLAGS

2018-09-04 Thread Pasha Tatashin
On 9/4/18 5:13 PM, Alexander Duyck wrote: > On Tue, Sep 4, 2018 at 1:07 PM Pasha Tatashin > wrote: >> >> Hi Alexander, >> >> This is a wrong way to do it. memblock_virt_alloc_try_nid_raw() does not >> initialize allocated memory, and by setting memory to

Re: Plumbers 2018 - Performance and Scalability Microconference

2018-09-05 Thread Pasha Tatashin
On 9/5/18 2:38 AM, Mike Rapoport wrote: > On Tue, Sep 04, 2018 at 05:28:13PM -0400, Daniel Jordan wrote: >> Pavel Tatashin, Ying Huang, and I are excited to be organizing a performance >> and scalability microconference this year at Plumbers[*], which is happening >> in Vancouver this year. Th

Re: [PATCH 2/2] mm: Create non-atomic version of SetPageReserved for init use

2018-09-05 Thread Pasha Tatashin
On 9/5/18 4:18 PM, Alexander Duyck wrote: > On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko wrote: >> >> On Tue 04-09-18 11:33:45, Alexander Duyck wrote: >>> From: Alexander Duyck >>> >>> It doesn't make much sense to use the atomic SetPageReserved at init time >>> when we are using memset to clea
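
Hedged illustration of the point (the wrapper name is made up): at memmap-init time nobody else can observe the page yet, so the plain, non-atomic flag setter is enough and avoids a locked read-modify-write per page.

	static void __meminit mark_page_reserved_sketch(struct page *page)
	{
		/* SetPageReserved() is an atomic set_bit(); during early init the
		 * non-atomic variant is sufficient and cheaper. */
		__SetPageReserved(page);
	}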

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-05 Thread Pasha Tatashin
On 9/5/18 5:13 PM, Alexander Duyck wrote: > From: Alexander Duyck > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with >

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-05 Thread Pasha Tatashin
On 9/5/18 5:29 PM, Alexander Duyck wrote: > On Wed, Sep 5, 2018 at 2:22 PM Pasha Tatashin > wrote: >> >> >> >> On 9/5/18 5:13 PM, Alexander Duyck wrote: >>> From: Alexander Duyck >>> >>> On systems with a large amount of memory it can t

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-06 Thread Pasha Tatashin
On 9/6/18 11:41 AM, Alexander Duyck wrote: > On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote: >> >> On Thu 06-09-18 07:59:03, Dave Hansen wrote: >>> On 09/05/2018 10:47 PM, Michal Hocko wrote: why do you have to keep DEBUG_VM enabled for workloads where the boot time matters so much

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-06 Thread Pasha Tatashin
On 9/6/18 1:03 PM, Michal Hocko wrote: > On Thu 06-09-18 08:41:52, Alexander Duyck wrote: >> On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote: >>> >>> On Thu 06-09-18 07:59:03, Dave Hansen wrote: On 09/05/2018 10:47 PM, Michal Hocko wrote: > why do you have to keep DEBUG_VM enabled for

Re: [PATCH 1/5] mm/memory_hotplug: Spare unnecessary calls to node_set_state

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > In node_states_check_changes_online, we check if the node will > have to be set for any of the N_*_MEMORY states after the pages > have been onlined. > > Later on, we perform the activation in node_states_set_node. > Currentl

Re: [PATCH 2/5] mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when !CONFIG_HIGHMEM

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set > to status_change_nid_normal, but on such systems N_HIGH_MEMORY falls > back to N_NORMAL_MEMORY. > That means that if status_change_nid_normal is not -1, >

Re: [PATCH 3/5] mm/memory_hotplug: Tidy up node_states_clear_node

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > node_states_clear has the following if statements: > > if ((N_MEMORY != N_NORMAL_MEMORY) && > (arg->status_change_nid_high >= 0)) > ... > > if ((N_MEMORY != N_HIGH_MEMORY) && > (arg->status_change_nid >= 0))

Re: [PATCH 4/5] mm/memory_hotplug: Simplify node_states_check_changes_online

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > While looking at node_states_check_changes_online, I stumbled > upon some confusing things. > > Right after entering the function, we find this: > > if (N_MEMORY == N_NORMAL_MEMORY) > zone_last = ZONE_MOVABLE; > > T

Re: [PATCH 5/5] mm/memory_hotplug: Clean up node_states_check_changes_offline

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > This patch, as the previous one, gets rid of the wrong if statements. > While at it, I realized that the comments are sometimes very confusing, > to say the least, and wrong. > For example: > > --- > zone_last = ZONE_MOVABLE; >

Re: [PATCH v2 3/4] mm/memory_hotplug: Simplify node_states_check_changes_online

2018-09-21 Thread Pasha Tatashin
On 9/21/18 9:26 AM, Oscar Salvador wrote: > From: Oscar Salvador > > While looking at node_states_check_changes_online, I stumbled > upon some confusing things. > > Right after entering the function, we find this: > > if (N_MEMORY == N_NORMAL_MEMORY) > zone_last = ZONE_MOVABLE; > > T

Re: [PATCH v4 3/5] mm: Defer ZONE_DEVICE page initialization to the point where we init pgmap

2018-09-21 Thread Pasha Tatashin
On 9/20/18 6:29 PM, Alexander Duyck wrote: > The ZONE_DEVICE pages were being initialized in two locations. One was with > the memory_hotplug lock held and another was outside of that lock. The > problem with this is that it was nearly doubling the memory initialization > time. Instead of doing t

Re: [PATCH] Revert "x86/tsc: Consolidate init code"

2018-09-10 Thread Pasha Tatashin
Hi Ville, The failure is surprising, because the commit is tiny, and almost does not change the code logic. From looking through the commit, the only functional difference this commit makes is: static_branch_enable(&__use_tsc) was called unconditionally from tsc_init(), but after the commit onl

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On 9/10/18 9:17 AM, Michal Hocko wrote: > [Cc Pavel] > > On Mon 10-09-18 14:35:27, Mikhail Zaslonko wrote: >> If memory end is not aligned with the linux memory section boundary, such >> a section is only partly initialized. This may lead to VM_BUG_ON due to >> uninitialized struct pages access

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
Hi Michal, It is tricky, but probably can be done. Either change memmap_init_zone() or its caller to also cover the ends and starts of unaligned sections to initialize and reserve pages. The same thing would also need to be done in deferred_init_memmap() to cover the deferred init case. For hotp

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote: > > On Mon 10-09-18 14:11:45, Pavel Tatashin wrote: > > Hi Michal, > > > > It is tricky, but probably can be done. Either change > > memmap_init_zone() or its caller to also cover the ends and starts of > > unaligned sections to initialize and r

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On 9/10/18 10:41 AM, Michal Hocko wrote: > On Mon 10-09-18 14:32:16, Pavel Tatashin wrote: >> On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote: >>> >>> On Mon 10-09-18 14:11:45, Pavel Tatashin wrote: Hi Michal, It is tricky, but probably can be done. Either change memmap_i

Re: [PATCH v1 1/5] mm/memory_hotplug: drop intermediate __offline_pages

2018-08-30 Thread Pasha Tatashin
I guess the wrapper was added because of __ref, but there is no reason to have this wrapper. So looks good to me. Reviewed-by: Pavel Tatashin On 8/16/18 6:06 AM, David Hildenbrand wrote: > Let's avoid this indirection and just call the function offline_pages(). > > Signed-off-by: David Hildenbrand > --- > mm

Re: [PATCH v1 1/5] mm/memory_hotplug: drop intermediate __offline_pages

2018-08-30 Thread Pasha Tatashin
On 8/30/18 4:17 PM, Pasha Tatashin wrote: > I guess the wrap was done because of __ref, but no reason to have this > wrap. So looks good to me. > > Reviewed-by: Pavel Tatashin > > On 8/16/18 6:06 AM, David Hildenbrand wrote: >> Let's avoid this indirecti

Re: [PATCH v1 2/5] mm/memory_hotplug: enforce section alignment when onlining/offlining

2018-08-30 Thread Pasha Tatashin
Hi David, I am not sure this is needed, because we already have a stricter checker: check_hotplug_memory_range() You could call it from online_pages(), if you think there is a reason to do it, but other than that it is done from add_memory_resource() and from remove_memory(). Thank you, Pavel

Re: [PATCH v1 3/5] mm/memory_hotplug: check if sections are already online/offline

2018-08-30 Thread Pasha Tatashin
On 8/16/18 7:00 AM, David Hildenbrand wrote: > On 16.08.2018 12:47, Oscar Salvador wrote: >> On Thu, Aug 16, 2018 at 12:06:26PM +0200, David Hildenbrand wrote: >> >>> + >>> +/* check if all mem sections are offline */ >>> +bool mem_sections_offline(unsigned long pfn, unsigned long end_pfn) >>> +{ >

Re: [PATCH v1 4/5] mm/memory_hotplug: onlining pages can only fail due to notifiers

2018-08-30 Thread Pasha Tatashin
LGTM Reviewed-by: Pavel Tatashin On 8/16/18 6:06 AM, David Hildenbrand wrote: > Onlining pages can only fail if a notifier reported a problem (e.g. -ENOMEM). > online_pages_range() can never fail. > > Signed-off-by: David Hildenbrand > --- > mm/memory_hotplug.c | 9 ++--- > 1 file changed

Re: [PATCH v1 5/5] mm/memory_hotplug: print only with DEBUG_VM in online/offline_pages()

2018-08-30 Thread Pasha Tatashin
On 8/20/18 6:46 AM, David Hildenbrand wrote: > On 16.08.2018 12:06, David Hildenbrand wrote: >> Let's try to minimize the noise. >> >> Signed-off-by: David Hildenbrand >> --- >> mm/memory_hotplug.c | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_

Re: [PATCH] mm/page_alloc: Clean up check_for_memory

2018-08-31 Thread Pasha Tatashin
On 8/31/18 8:24 AM, Oscar Salvador wrote: > On Thu, Aug 30, 2018 at 01:55:29AM +0000, Pasha Tatashin wrote: >> I would re-write the above function like this: >> static void check_for_memory(pg_data_t *pgdat, int nid) >> { >> enum zone_type zone_type; >&

[PATCH] mm: Disable deferred struct page for 32-bit arches

2018-08-31 Thread Pasha Tatashin
Deferred struct page init is needed only on systems with a large amount of physical memory to improve boot performance. 32-bit systems do not benefit from this feature. Jiri reported a problem where deferred struct pages do not work well with x86-32: [0.035162] Dentry cache hash table entries:

Re: [v1 0/9] Early boot time stamps for x86

2017-03-22 Thread Pasha Tatashin
Hi Peter, Thank you for looking at this patchset. Yes, I am certain it is 0 or near 0 on reset on this machine. Because, I actually wondered about it, and used stop watch as an alternative way to verify the result, twice. While, I suspect it is often the case that on reset tsc is 0, it is n

Re: [v1 0/9] Early boot time stamps for x86

2017-03-22 Thread Pasha Tatashin
On 2017-03-22 16:27, Peter Zijlstra wrote: On Wed, Mar 22, 2017 at 04:24:16PM -0400, Pavel Tatashin wrote: Last week I sent out patches to enable early boot time stamp on SPARC, that work is now under review: http://www.spinics.net/lists/sparclinux/msg17372.html This is continuation for that wo

Re: [v1 0/9] Early boot time stamps for x86

2017-03-23 Thread Pasha Tatashin
Hi Thomas, Thank you very much for looking at this patchset. Comments below: On 03/23/2017 06:56 AM, Thomas Gleixner wrote: On Wed, 22 Mar 2017, Pasha Tatashin wrote: Yes, I am certain it is 0 or near 0 on reset on this machine. Because, I Emphasis on "this machine". It's

Re: [v1 0/5] parallelized "struct page" zeroing

2017-03-23 Thread Pasha Tatashin
Hi Matthew, Thank you for your comment. If you look at the data, having memset() actually benefits initializing data. With base it takes: [ 66.148867] node 0 initialised, 128312523 pages in 7200ms With fix: [ 15.260634] node 0 initialised, 128312523 pages in 4190ms So 4.19s vs 7.2s for t

Re: [v1 0/5] parallelized "struct page" zeroing

2017-03-23 Thread Pasha Tatashin
On 03/23/2017 07:35 PM, David Miller wrote: From: Matthew Wilcox Date: Thu, 23 Mar 2017 16:26:38 -0700 On Thu, Mar 23, 2017 at 07:01:48PM -0400, Pavel Tatashin wrote: When deferred struct page initialization feature is enabled, we get a performance gain of initializing vmemmap in parallel a

Re: [v1 0/5] parallelized "struct page" zeroing

2017-03-23 Thread Pasha Tatashin
On 03/23/2017 07:47 PM, Pasha Tatashin wrote: How long does it take if we just don't zero this memory at all? We seem to be initialising most of struct page in __init_single_page(), so it seems like a lot of additional complexity to conditionally zero the rest of struct page. Alternat

Re: [v2 0/9] Early boot time stamps for x86

2017-03-25 Thread Pasha Tatashin
Hi Thomas, The second version was actually meant as a reply to your e-mail: the code differences were minimal; the main differences were in the cover letter. You mentioned that it is not necessary to have early boot time stamps, and I wanted to show examples of how this data is useful to track

Re: [v2 0/9] Early boot time stamps for x86

2017-03-25 Thread Pasha Tatashin
Hi Thomas, Thank you very much for a very insightful feedback. I will address your comments, and if I have any questions, I will ask them before sending out the next patchset. A few replies below: First of all, this "solution" is only valid for a very restricted set of systems and breaks ot

Re: linux-next: manual merge of the akpm-current tree with the sparc tree

2017-04-11 Thread Pasha Tatashin
Hi David and Andrew, Should I send this change to sparc ml for discussion? Thank you, Pasha On 03/29/2017 01:43 AM, David Miller wrote: From: Stephen Rothwell Date: Wed, 29 Mar 2017 16:37:46 +1100 3f506bf2a354 ("sparc64: NG4 memset 32 bits overflow") Andrew, this change still needs dis

Re: [PATCH] pid: kill pidhash_size in pidhash_init

2017-07-18 Thread Pasha Tatashin
Reviewed-by: Pavel Tatashin On 07/18/2017 10:47 AM, Kefeng Wang wrote: After commit 3d375d78593c ("mm: update callers to use HASH_ZERO flag"), drop unused pidhash_size in pidhash_init(). Signed-off-by: Kefeng Wang --- kernel/pid.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/kerne

Re: [v6 00/15] complete deferred page initialization

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 03:58 AM, Michal Hocko wrote: [I am sorry I didn't get to your previous versions] Thank you for reviewing this work. I will address your comments and send out new patches. In this work we do the following: - Never read-access struct page until it is initialized How is t

Re: [v6 01/15] x86/mm: reserve only exiting low pages

2017-08-11 Thread Pasha Tatashin
Struct pages are initialized by going through __init_single_page(). Since the existing physical memory in memblock is represented in memblock.memory list, struct page for every page from this list goes through __init_single_page(). By a page _from_ this list you mean struct pages backing the phy

Re: [v6 02/15] x86/mm: setting fields in deferred pages

2017-08-11 Thread Pasha Tatashin
AFAIU register_page_bootmem_info_node is only about struct pages backing pgdat, usemap and memmap. Those should be in reserved memblocks and we do not initialize those at later times, they are not relevant to the deferred initialization as your changelog suggests so the ordering with get_page_boot

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
I guess this goes all the way down to Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel with kswapd") I will add this to the patch. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Considering that

Re: [v6 05/15] mm: don't accessed uninitialized struct pages

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 05:37 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:39, Pavel Tatashin wrote: In deferred_init_memmap() where all deferred struct pages are initialized we have a check like this: if (page->flags) { VM_BUG_ON(page_zone(page) != zone); goto free_range;

Re: [v6 07/15] mm: defining memblock_virt_alloc_try_nid_raw

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 08:39 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:41, Pavel Tatashin wrote: A new variant of memblock_virt_alloc_* allocations: memblock_virt_alloc_try_nid_raw() - Does not zero the allocated memory - Does not panic if request cannot be satisfied OK, this looks good b

Re: [v6 08/15] mm: zero struct pages during initialization

2017-08-11 Thread Pasha Tatashin
I believe this deserves much more detailed explanation why this is safe. What actually prevents any pfn walker from seeing an uninitialized struct page? Please make your assumptions explicit in the commit log so that we can check them independently. There is nothing that prevents pfn walkers from wal

Re: [v6 09/15] sparc64: optimized struct page zeroing

2017-08-11 Thread Pasha Tatashin
Add an optimized mm_zero_struct_page(), so struct pages are zeroed without calling memset(). We do eight to ten regular stores based on the size of struct page. Compiler optimizes out the conditions of switch() statement. Again, this doesn't explain why we need this. You have mentioned those r
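
A hedged sketch of what "eight to ten regular stores" means: with sizeof(struct page) known at compile time, the switch folds away and zeroing a 64-byte struct page becomes eight straight-line 8-byte stores (the real sparc64 version lives in arch/sparc and covers several struct page sizes):

	#define mm_zero_struct_page_sketch(pp) do {			\
		unsigned long *_p = (unsigned long *)(pp);		\
									\
		switch (sizeof(struct page)) {				\
		case 64:						\
			_p[7] = 0; _p[6] = 0; _p[5] = 0; _p[4] = 0;	\
			_p[3] = 0; _p[2] = 0; _p[1] = 0; _p[0] = 0;	\
			break;						\
		default:						\
			memset(pp, 0, sizeof(struct page));		\
		}							\
	} while (0)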

Re: [v6 13/15] mm: stop zeroing memory during allocation in vmemmap

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 09:04 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:47, Pavel Tatashin wrote: Replace allocators in sparse-vmemmap to use the non-zeroing version. So, we will get the performance improvement by zeroing the memory in parallel when struct pages are zeroed. First of all this should

Re: [v6 14/15] mm: optimize early system hash allocations

2017-08-11 Thread Pasha Tatashin
Clients can call alloc_large_system_hash() with flag: HASH_ZERO to specify that memory that was allocated for system hash needs to be zeroed, otherwise the memory does not need to be zeroed, and client will initialize it. If memory does not need to be zero'd, call the new memblock_virt_alloc_raw(

Re: [v6 15/15] mm: debug for raw alloctor

2017-08-11 Thread Pasha Tatashin
When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is returned by memblock_virt_alloc_try_nid_raw() to ones to ensure that no places expect zeroed memory. Please fold this into the patch which introduces memblock_virt_alloc_try_nid_raw. OK I am not sure CONFIG_DEBUG_VM is

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
I will address your comment, and send out a new patch. Should I send it out separately from the series or should I keep it inside? I would post it separately. It doesn't depend on the rest. OK, I will post it separately. No, it does not depend on the rest, but the rest depends on this. So, I

Re: [v6 07/15] mm: defining memblock_virt_alloc_try_nid_raw

2017-08-11 Thread Pasha Tatashin
Sure, I could do this, but as I understood from earlier Dave Miller's comments, we should do one logical change at a time. Hence, introduce API in one patch use it in another. So, this is how I tried to organize this patch set. Is this assumption incorrect? Well, it really depends. If the patch

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
Hi Michal, This suggestion won't work, because there are arches without memblock support: tile, sh... So, I would still need to have: #ifdef CONFIG_MEMBLOCK in page_alloc, or define memblock_discard() stubs in the nobootmem header file. In either case it would become messier than what it is right

Re: [PATCH v7 07/11] sparc64: optimized struct page zeroing

2017-08-30 Thread Pasha Tatashin
Hi Dave, Thank you for acking. The reason I am not doing initializing stores is because they require a membar, even if only regular stores are following (I hoped to do a membar before first load). This is something I was thinking was not true, but after consulting with colleagues and checking

Re: [PATCH v6 4/4] x86/tsc: use tsc early

2017-08-30 Thread Pasha Tatashin
Hi Fenghua, Thank you for looking at this. Unfortunately I can't mark either of them __init because sched_clock_early() is called from u64 sched_clock_cpu(int cpu), which is around for the life of the system. Thank you, Pasha On 08/30/2017 05:21 PM, Fenghua Yu wrote: On Wed, Aug 30,

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
Hi Thomas, Thank you for your comments. My replies below. +/* + * Called once during boot to initialize boot time. + */ +void read_boot_clock64(struct timespec64 *ts) And because it's called only once, it does not need to be marked __init() and must be kept around forever, right? This is

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
And because it's called only once, it does not need to be marked __init() and must be kept around forever, right? This is because every other architecture implements read_boot_clock64() without __init: arm, s390. Besides, the original weak stub does not have the __init macro. So, I can certainly try t

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
void __init timekeeping_init(void) { /* We must determine boot timestamp before getting current persistent clock value, because implementation of read_boot_clock64() might also call the persistent clock, and a leap second may occur.

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-09 Thread Pasha Tatashin
Hi Michal, I like the idea of postponing the zeroing from the allocation to the init time. To be honest the improvement looks much larger than I would expect (Btw. this should be a part of the changelog rather than an outside link). The improvements are larger, because this time was never measu

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-10 Thread Pasha Tatashin
Well, I didn't object to this particular part. I was mostly concerned about http://lkml.kernel.org/r/1494003796-748672-4-git-send-email-pasha.tatas...@oracle.com and the "zero" argument for other functions. I guess we can do without that. I _think_ that we should simply _always_ initialize the pa

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-10 Thread Pasha Tatashin
On 05/10/2017 10:57 AM, Michal Hocko wrote: On Wed 10-05-17 09:42:22, Pasha Tatashin wrote: Well, I didn't object to this particular part. I was mostly concerned about http://lkml.kernel.org/r/1494003796-748672-4-git-send-email-pasha.tatas...@oracle.com and the "zero" arg

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-31 Thread Pasha Tatashin
OK, so why cannot we make zero_struct_page 8x 8B stores, other arches would do memset. You said it would be slower but would that be measurable? I am sorry to be so persistent here but I would be really happier if this didn't depend on the deferred initialization. If this is absolutely a no-go the

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-15 Thread Pasha Tatashin
Hi Michal, After looking at your suggested memblock_virt_alloc_core() change again, I decided to keep what I have. I do not want to inline memblock_virt_alloc_internal(), because it is not a performance critical path, and by inlining it we will unnecessarily increase the text size on all plat

Re: [v3 9/9] s390: teach platforms not to zero struct pages memory

2017-05-15 Thread Pasha Tatashin
Hi Heiko, Thank you for looking at this patch. I am worried about making the proposed change, because, as I understand it, in this case we allocate memory not for "struct page"s but for the table that holds them. So, we will change the behavior from the current one, where this table is allocated zeroed, but

Re: [v3 0/9] parallelized "struct page" zeroing

2017-05-15 Thread Pasha Tatashin
On 05/15/2017 03:38 PM, Michal Hocko wrote: On Mon 15-05-17 14:12:10, Pasha Tatashin wrote: Hi Michal, After looking at your suggested memblock_virt_alloc_core() change again, I decided to keep what I have. I do not want to inline memblock_virt_alloc_internal(), because it is not a

Re: [v3 9/9] s390: teach platforms not to zero struct pages memory

2017-05-15 Thread Pasha Tatashin
Ah OK, I will include the change. Thank you, Pasha On 05/15/2017 07:17 PM, Heiko Carstens wrote: Hello Pasha, Thank you for looking at this patch. I am worried to make the proposed change, because, as I understand in this case we allocate memory not for "struct page"s but for table that hold

Re: Widespread crashes in -next, bisected to 'mm: drop HASH_ADAPT'

2017-05-20 Thread Pasha Tatashin
The problem is a 32-bit integer overflow involving ADAPT_SCALE_BASE and adapt in dcache_init_early(). It was not enabled before 'mm: drop HASH_ADAPT' but is enabled now, and it should follow right after: "PID hash table entries: 1024 (order: 0, 4096 bytes)" main()
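
A standalone demo (hedged; the constant mirrors the order of magnitude of ADAPT_SCALE_BASE) of the overflow class described above: a 64 GB value does not fit in a 32-bit unsigned long, so on 32-bit kernels the scaling arithmetic silently wraps unless the constant is made 64-bit.

	#include <stdio.h>

	int main(void)
	{
		unsigned long bad = 64ul << 30;		/* wraps to 0 where long is 32-bit */
		unsigned long long ok = 64ull << 30;	/* 68719476736 everywhere */

		printf("64ul  << 30 = %lu\n", bad);
		printf("64ull << 30 = %llu\n", ok);
		return 0;
	}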

Re: [v4 1/1] mm: Adaptive hash table scaling

2017-05-21 Thread Pasha Tatashin
Hi Andi, Thank you for looking at this. I mentioned earlier, I would not want to impose a cap. However, if you think that for example dcache needs a cap, there is already a mechanism for that via high_limit argument, so the client can be changed to provide that cap. However, this particular p

Re: [v4 1/1] mm: Adaptive hash table scaling

2017-05-22 Thread Pasha Tatashin
I have only noticed this email today because my incoming emails stopped syncing since Friday. But this is _definitely_ not the right approach. 64G for 32b systems is _way_ off. We have only ~1G for the kernel. I've already proposed scaling up to 32M for 32b systems and Andi seems to be suggestin

Re: [v4 1/1] mm: Adaptive hash table scaling

2017-05-22 Thread Pasha Tatashin
This is just too ugly to live, really. If we do not need adaptive scaling then just make it #if __BITS_PER_LONG around the code. I would be fine with this. A big fat warning explaining why this is 64b only would be appropriate. OK, let me prettify it somehow, and I will send a new patch out.

Re: qemu sparc64 runtime crashes in -next

2017-06-14 Thread Pasha Tatashin
I think I know the problem, and working on a fix. Will send it out soon. Thank you, Pasha On 06/14/2017 04:42 PM, Guenter Roeck wrote: On Wed, Jun 14, 2017 at 03:31:08PM -0400, David Miller wrote: From: Guenter Roeck Date: Wed, 14 Jun 2017 03:13:54 -0700 Hi, my sparc qemu tests started fai
