[PATCH v2] vhost-vdpa: account iommu allocations

2023-12-26 Thread Pasha Tatashin
IOMMU allocations should be accounted in order to allow admins to monitor and limit the amount of IOMMU memory. Signed-off-by: Pasha Tatashin Acked-by: Michael S. Tsirkin --- drivers/vhost/vdpa.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) Changelog: v1: This patch is spun off
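For context, a minimal sketch of the kind of change the patch describes: the vDPA DMA-mapping path passes an accounted GFP flag so that IOMMU mapping memory is charged to the calling process. The helper name and exact arguments below are illustrative, not the literal hunk from the patch.

    /* Illustrative only: charge IOMMU mapping allocations to the caller's memcg. */
    static int vdpa_map_accounted(struct iommu_domain *domain, unsigned long iova,
                                  phys_addr_t pa, size_t size, int prot)
    {
            /* GFP_KERNEL_ACCOUNT makes the allocation count against memcg limits. */
            return iommu_map(domain, iova, pa, size, prot, GFP_KERNEL_ACCOUNT);
    }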

[PATCH] vhost-vdpa: account iommu allocations

2023-11-30 Thread Pasha Tatashin
IOMMU allocations should be accounted in order to allow admins to monitor and limit the amount of IOMMU memory. Signed-off-by: Pasha Tatashin --- drivers/vhost/vdpa.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) This patch is spun off from the series: https://lore.kernel.org/all

Re: [PATCH v1] virtio_pmem: populate numa information

2022-10-26 Thread Pasha Tatashin
g the numa node is taken from cxl_pmem_region_probe in > drivers/cxl/pmem.c. > > Signed-off-by: Michael Sammler Enables the hot-plugging of virtio-pmem memory into correct memory nodes. Does not look like it affects FS_DAX. Reviewed-by: Pasha Tatashin Thanks, Pasha > --- >

Re: [mm PATCH v4 3/6] mm: Use memblock/zone specific iterator for handling deferred page init

2018-10-31 Thread Pasha Tatashin
On 10/31/18 12:05 PM, Alexander Duyck wrote: > On Wed, 2018-10-31 at 15:40 +0000, Pasha Tatashin wrote: >> >> On 10/17/18 7:54 PM, Alexander Duyck wrote: >>> This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone. >>> >>> This iter

Re: [mm PATCH v4 3/6] mm: Use memblock/zone specific iterator for handling deferred page init

2018-10-31 Thread Pasha Tatashin
On 10/17/18 7:54 PM, Alexander Duyck wrote: > This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone. > > This iterator will take care of making sure a given memory range provided > is in fact contained within a zone. It takes are of all the bounds checking > we were doing in d
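The quoted description is truncated; a hedged sketch of how such a zone-clamped iterator is typically consumed follows (the deferred_init_pages() helper and variable types are assumptions, not the exact code from the series):

    /* Sketch: walk only free memblock ranges that fall inside the given zone. */
    u64 i;
    unsigned long spfn, epfn, nr_pages = 0;

    for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
            /* Each (spfn, epfn) range is already clamped to the zone, so the
             * per-page loop no longer needs its own bounds checking. */
            nr_pages += deferred_init_pages(zone, spfn, epfn);
    }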

Re: [PATCH v4 3/5] mm: Defer ZONE_DEVICE page initialization to the point where we init pgmap

2018-09-21 Thread Pasha Tatashin
On 9/20/18 6:29 PM, Alexander Duyck wrote: > The ZONE_DEVICE pages were being initialized in two locations. One was with > the memory_hotplug lock held and another was outside of that lock. The > problem with this is that it was nearly doubling the memory initialization > time. Instead of doing t

Re: [PATCH v2 3/4] mm/memory_hotplug: Simplify node_states_check_changes_online

2018-09-21 Thread Pasha Tatashin
On 9/21/18 9:26 AM, Oscar Salvador wrote: > From: Oscar Salvador > > While looking at node_states_check_changes_online, I stumbled > upon some confusing things. > > Right after entering the function, we find this: > > if (N_MEMORY == N_NORMAL_MEMORY) > zone_last = ZONE_MOVABLE; > > T

Re: [PATCH 5/5] mm/memory_hotplug: Clean up node_states_check_changes_offline

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > This patch, as the previous one, gets rid of the wrong if statements. > While at it, I realized that the comments are sometimes very confusing, > to say the least, and wrong. > For example: > > --- > zone_last = ZONE_MOVABLE; >

Re: [PATCH 4/5] mm/memory_hotplug: Simplify node_states_check_changes_online

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > While looking at node_states_check_changes_online, I stumbled > upon some confusing things. > > Right after entering the function, we find this: > > if (N_MEMORY == N_NORMAL_MEMORY) > zone_last = ZONE_MOVABLE; > > T

Re: [PATCH 3/5] mm/memory_hotplug: Tidy up node_states_clear_node

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > node_states_clear has the following if statements: > > if ((N_MEMORY != N_NORMAL_MEMORY) && > (arg->status_change_nid_high >= 0)) > ... > > if ((N_MEMORY != N_HIGH_MEMORY) && > (arg->status_change_nid >= 0))

Re: [PATCH 2/5] mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when !CONFIG_HIGHMEM

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set > to status_change_nid_normal, but on such systems N_HIGH_MEMORY falls > back to N_NORMAL_MEMORY. > That means that if status_change_nid_normal is not -1, >

Re: [PATCH 1/5] mm/memory_hotplug: Spare unnecessary calls to node_set_state

2018-09-20 Thread Pasha Tatashin
On 9/19/18 6:08 AM, Oscar Salvador wrote: > From: Oscar Salvador > > In node_states_check_changes_online, we check if the node will > have to be set for any of the N_*_MEMORY states after the pages > have been onlined. > > Later on, we perform the activation in node_states_set_node. > Currentl

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-12 Thread Pasha Tatashin
On 9/12/18 10:27 AM, Gerald Schaefer wrote: > On Wed, 12 Sep 2018 15:39:33 +0200 > Michal Hocko wrote: > >> On Wed 12-09-18 15:03:56, Gerald Schaefer wrote: >> [...] >>> BTW, those sysfs attributes are world-readable, so anyone can trigger >>> the panic by simply reading them, or just run lsmem

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On 9/10/18 10:41 AM, Michal Hocko wrote: > On Mon 10-09-18 14:32:16, Pavel Tatashin wrote: >> On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote: >>> >>> On Mon 10-09-18 14:11:45, Pavel Tatashin wrote: Hi Michal, It is tricky, but probably can be done. Either change memmap_i

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote: > > On Mon 10-09-18 14:11:45, Pavel Tatashin wrote: > > Hi Michal, > > > > It is tricky, but probably can be done. Either change > > memmap_init_zone() or its caller to also cover the ends and starts of > > unaligned sections to initialize and r

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
Hi Michal, It is tricky, but probably can be done. Either change memmap_init_zone() or its caller to also cover the ends and starts of unaligned sections to initialize and reserve pages. The same thing would also need to be done in deferred_init_memmap() to cover the deferred init case. For hotp

Re: [PATCH] memory_hotplug: fix the panic when memory end is not on the section boundary

2018-09-10 Thread Pasha Tatashin
On 9/10/18 9:17 AM, Michal Hocko wrote: > [Cc Pavel] > > On Mon 10-09-18 14:35:27, Mikhail Zaslonko wrote: >> If memory end is not aligned with the linux memory section boundary, such >> a section is only partly initialized. This may lead to VM_BUG_ON due to >> uninitialized struct pages access

Re: [PATCH] Revert "x86/tsc: Consolidate init code"

2018-09-10 Thread Pasha Tatashin
Hi Ville, The failure is surprising, because the commit is tiny, and almost does not change the code logic. From looking through the commit, the only functional difference this commit makes is: static_branch_enable(&__use_tsc) was called unconditionally from tsc_init(), but after the commit onl
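For readers unfamiliar with the mechanism referenced here, the static-branch pattern looks roughly like the following; the helpers marked hypothetical are placeholders, and only the DEFINE_STATIC_KEY_FALSE / static_branch_enable / static_branch_likely calls reflect the real API:

    static DEFINE_STATIC_KEY_FALSE(__use_tsc);

    void __init tsc_init_sketch(void)
    {
            if (tsc_is_usable())                      /* hypothetical check */
                    static_branch_enable(&__use_tsc); /* flip the branch once */
    }

    u64 sched_clock_sketch(void)
    {
            if (static_branch_likely(&__use_tsc))
                    return tsc_cycles_to_ns(rdtsc()); /* hypothetical conversion */
            return jiffies_clock();                   /* hypothetical fallback */
    }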

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-06 Thread Pasha Tatashin
On 9/6/18 1:03 PM, Michal Hocko wrote: > On Thu 06-09-18 08:41:52, Alexander Duyck wrote: >> On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote: >>> >>> On Thu 06-09-18 07:59:03, Dave Hansen wrote: On 09/05/2018 10:47 PM, Michal Hocko wrote: > why do you have to keep DEBUG_VM enabled for

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-06 Thread Pasha Tatashin
On 9/6/18 11:41 AM, Alexander Duyck wrote: > On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote: >> >> On Thu 06-09-18 07:59:03, Dave Hansen wrote: >>> On 09/05/2018 10:47 PM, Michal Hocko wrote: why do you have to keep DEBUG_VM enabled for workloads where the boot time matters so much

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-05 Thread Pasha Tatashin
On 9/5/18 5:29 PM, Alexander Duyck wrote: > On Wed, Sep 5, 2018 at 2:22 PM Pasha Tatashin > wrote: >> >> >> >> On 9/5/18 5:13 PM, Alexander Duyck wrote: >>> From: Alexander Duyck >>> >>> On systems with a large amount of memory it can t

Re: [PATCH v2 1/2] mm: Move page struct poisoning to CONFIG_DEBUG_VM_PAGE_INIT_POISON

2018-09-05 Thread Pasha Tatashin
On 9/5/18 5:13 PM, Alexander Duyck wrote: > From: Alexander Duyck > > On systems with a large amount of memory it can take a significant amount > of time to initialize all of the page structs with the PAGE_POISON_PATTERN > value. I have seen it take over 2 minutes to initialize a system with >
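The initialization cost being measured comes from poisoning every struct page before real init; conceptually it is just a memset with an all-ones pattern, gated by the config option named in the subject (the helper below is illustrative):

    /* Sketch: poison the memmap so any access before __init_single_page()
     * trips the page-flags sanity checks. */
    static void poison_struct_pages(struct page *start, unsigned long count)
    {
            if (IS_ENABLED(CONFIG_DEBUG_VM_PAGE_INIT_POISON))
                    memset(start, PAGE_POISON_PATTERN, count * sizeof(struct page));
    }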

Re: [PATCH 2/2] mm: Create non-atomic version of SetPageReserved for init use

2018-09-05 Thread Pasha Tatashin
On 9/5/18 4:18 PM, Alexander Duyck wrote: > On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko wrote: >> >> On Tue 04-09-18 11:33:45, Alexander Duyck wrote: >>> From: Alexander Duyck >>> >>> It doesn't make much sense to use the atomic SetPageReserved at init time >>> when we are using memset to clea

Re: Plumbers 2018 - Performance and Scalability Microconference

2018-09-05 Thread Pasha Tatashin
On 9/5/18 2:38 AM, Mike Rapoport wrote: > On Tue, Sep 04, 2018 at 05:28:13PM -0400, Daniel Jordan wrote: >> Pavel Tatashin, Ying Huang, and I are excited to be organizing a performance >> and scalability microconference this year at Plumbers[*], which is happening >> in Vancouver this year. Th

Re: [PATCH 1/2] mm: Move page struct poisoning from CONFIG_DEBUG_VM to CONFIG_DEBUG_VM_PGFLAGS

2018-09-04 Thread Pasha Tatashin
On 9/4/18 5:13 PM, Alexander Duyck wrote: > On Tue, Sep 4, 2018 at 1:07 PM Pasha Tatashin > wrote: >> >> Hi Alexander, >> >> This is a wrong way to do it. memblock_virt_alloc_try_nid_raw() does not >> initialize allocated memory, and by setting memory to

Re: [PATCH 1/2] mm: Move page struct poisoning from CONFIG_DEBUG_VM to CONFIG_DEBUG_VM_PGFLAGS

2018-09-04 Thread Pasha Tatashin
Hi Alexander, This is the wrong way to do it. memblock_virt_alloc_try_nid_raw() does not initialize allocated memory, and by setting memory to all ones in debug builds we ensure that no callers rely on this function to return zeroed memory just by accident. And, the accidents are frequent because mo

[PATCH] mm: Disable deferred struct page for 32-bit arches

2018-08-31 Thread Pasha Tatashin
Deferred struct page init is needed only on systems with a large amount of physical memory to improve boot performance. 32-bit systems do not benefit from this feature. Jiri reported a problem where deferred struct pages do not work well with x86-32: [0.035162] Dentry cache hash table entries:

Re: [PATCH] mm/page_alloc: Clean up check_for_memory

2018-08-31 Thread Pasha Tatashin
On 8/31/18 8:24 AM, Oscar Salvador wrote: > On Thu, Aug 30, 2018 at 01:55:29AM +0000, Pasha Tatashin wrote: >> I would re-write the above function like this: >> static void check_for_memory(pg_data_t *pgdat, int nid) >> { >> enum zone_type zone_type; >&
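The suggested rewrite is cut off above; a hedged reconstruction of the simplification being proposed (scan the zones once and derive the node states from the first populated zone) could look like:

    /* Sketch of the simplified helper discussed above (illustrative). */
    static void check_for_memory(pg_data_t *pgdat, int nid)
    {
            enum zone_type zone_type;

            for (zone_type = 0; zone_type <= ZONE_MOVABLE - 1; zone_type++) {
                    struct zone *zone = &pgdat->node_zones[zone_type];

                    if (populated_zone(zone)) {
                            if (IS_ENABLED(CONFIG_HIGHMEM))
                                    node_set_state(nid, N_HIGH_MEMORY);
                            if (zone_type <= ZONE_NORMAL)
                                    node_set_state(nid, N_NORMAL_MEMORY);
                            break;
                    }
            }
    }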

Re: [PATCH v1 5/5] mm/memory_hotplug: print only with DEBUG_VM in online/offline_pages()

2018-08-30 Thread Pasha Tatashin
On 8/20/18 6:46 AM, David Hildenbrand wrote: > On 16.08.2018 12:06, David Hildenbrand wrote: >> Let's try to minimize the noise. >> >> Signed-off-by: David Hildenbrand >> --- >> mm/memory_hotplug.c | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_

Re: [PATCH v1 4/5] mm/memory_hotplug: onlining pages can only fail due to notifiers

2018-08-30 Thread Pasha Tatashin
LGTM Reviewed-by: Pavel Tatashin On 8/16/18 6:06 AM, David Hildenbrand wrote: > Onlining pages can only fail if a notifier reported a problem (e.g. -ENOMEM). > online_pages_range() can never fail. > > Signed-off-by: David Hildenbrand > --- > mm/memory_hotplug.c | 9 ++--- > 1 file changed

Re: [PATCH v1 3/5] mm/memory_hotplug: check if sections are already online/offline

2018-08-30 Thread Pasha Tatashin
On 8/16/18 7:00 AM, David Hildenbrand wrote: > On 16.08.2018 12:47, Oscar Salvador wrote: >> On Thu, Aug 16, 2018 at 12:06:26PM +0200, David Hildenbrand wrote: >> >>> + >>> +/* check if all mem sections are offline */ >>> +bool mem_sections_offline(unsigned long pfn, unsigned long end_pfn) >>> +{ >

Re: [PATCH v1 2/5] mm/memory_hotplug: enforce section alignment when onlining/offlining

2018-08-30 Thread Pasha Tatashin
Hi David, I am not sure this is needed, because we already have a stricter checker: check_hotplug_memory_range() You could call it from online_pages(), if you think there is a reason to do it, but other than that it is done from add_memory_resource() and from remove_memory(). Thank you, Pavel

Re: [PATCH v1 1/5] mm/memory_hotplug: drop intermediate __offline_pages

2018-08-30 Thread Pasha Tatashin
On 8/30/18 4:17 PM, Pasha Tatashin wrote: > I guess the wrap was done because of __ref, but no reason to have this > wrap. So looks good to me. > > Reviewed-by: Pavel Tatashin > > On 8/16/18 6:06 AM, David Hildenbrand wrote: >> Let's avoid this indirecti

Re: [PATCH v1 1/5] mm/memory_hotplug: drop intermediate __offline_pages

2018-08-30 Thread Pasha Tatashin
I guess the wrap was done because of __ref, but no reason to have this wrap. So looks good to me. Reviewed-by: Pavel Tatashin On 8/16/18 6:06 AM, David Hildenbrand wrote: > Let's avoid this indirection and just call the function offline_pages(). > > Signed-off-by: David Hildenbrand > --- > mm

Re: [PATCH] mm/page_alloc: Clean up check_for_memory

2018-08-29 Thread Pasha Tatashin
On 8/28/18 5:01 PM, Oscar Salvador wrote: > From: Oscar Salvador > > check_for_memory looks a bit confusing. > First of all, we have this: > > if (N_MEMORY == N_NORMAL_MEMORY) > return; > > Checking the ENUM declaration, looks like N_MEMORY cannot be equal to > N_NORMAL_MEMORY. > I could

Re: [RFC v2 2/2] mm/memory_hotplug: Shrink spanned pages when offlining memory

2018-08-29 Thread Pasha Tatashin
On 8/17/18 11:41 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, we decrement zone/node spanned_pages when we > remove memory and not when we offline it. > > This, besides not being consistent with the current code, > implies that we can access stale pages if we never get to

Re: [PATCH] memory_hotplug: fix kernel_panic on offline page processing

2018-08-28 Thread Pasha Tatashin
On 8/28/18 7:25 AM, Michal Hocko wrote: > On Tue 28-08-18 11:05:39, Mikhail Zaslonko wrote: >> Within show_valid_zones() the function test_pages_in_a_zone() should be >> called for online memory blocks only. Otherwise it might lead to the >> VM_BUG_ON due to uninitialized struct pages (when CONFI

Re: [PATCH v4 3/4] mm/memory_hotplug: Define nodemask_t as a stack variable

2018-08-28 Thread Pasha Tatashin
On 8/17/18 5:00 AM, Oscar Salvador wrote: > From: Oscar Salvador > > Currently, unregister_mem_sect_under_nodes() tries to allocate a nodemask_t > in order to check within the loop which nodes have already been unlinked, > so we do not repeat the operation on them. > > NODEMASK_ALLOC calls km

Re: [PATCH 2/2] mm: zero remaining unavailable struct pages

2018-08-27 Thread Pasha Tatashin
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote: > From: Naoya Horiguchi > > There is a kernel panic that is triggered when reading /proc/kpageflags > on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]': > > BUG: unable to handle kernel paging request at fffe > PGD 9b2

Re: [PATCH 1/2] Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"

2018-08-27 Thread Pasha Tatashin
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote: > From: Masayoshi Mizuma > > commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into > memblock.reserved") breaks movable_node kernel option because it > changed the memory gap range to reserved memblock. So, the node > is marked as Normal zone

Re: [PATCH 1/2] Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"

2018-08-27 Thread Pasha Tatashin
On Mon, Aug 27, 2018 at 8:31 AM Masayoshi Mizuma wrote: > > Hi Pavel, > > I would appreciate if you could send the feedback for the patch. I will study it today. Pavel > > Thanks! > Masa > > On 08/24/2018 04:29 AM, Michal Hocko wrote: > > On Fri 24-08-18 00:03:25, Naoya Horiguchi wrote: > >> (C

Re: [RESEND PATCH v10 6/6] mm: page_alloc: reduce unnecessary binary search in early_pfn_valid()

2018-08-16 Thread Pasha Tatashin
On 7/6/18 5:01 AM, Jia He wrote: > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > where possible") optimized the loop in memmap_init_zone(). But there is > still some room for improvement. E.g. in early_pfn_valid(), if pfn and > pfn+1 are in the same memblock region, we

Re: [RESEND PATCH v10 3/6] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

2018-08-16 Thread Pasha Tatashin
On 8/16/18 9:08 PM, Pavel Tatashin wrote: > >> Signed-off-by: Jia He >> --- >> mm/memblock.c | 37 + >> 1 file changed, 29 insertions(+), 8 deletions(-) >> >> diff --git a/mm/memblock.c b/mm/memblock.c >> index ccad225..84f7fa7 100644 >> --- a/mm/memblock.c

Re: [RESEND PATCH v10 3/6] mm: page_alloc: reduce unnecessary binary search in memblock_next_valid_pfn()

2018-08-16 Thread Pasha Tatashin
> Signed-off-by: Jia He > --- > mm/memblock.c | 37 + > 1 file changed, 29 insertions(+), 8 deletions(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index ccad225..84f7fa7 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -1140,31 +1140,52 @@ int _

Re: [RESEND PATCH v10 2/6] mm: page_alloc: remain memblock_next_valid_pfn() on arm/arm64

2018-08-16 Thread Pasha Tatashin
On 18-07-06 17:01:11, Jia He wrote: > From: Jia He > > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > where possible") optimized the loop in memmap_init_zone(). But it causes > possible panic bug. So Daniel Vacek reverted it later. > > But as suggested by Daniel Vacek,

Re: [RESEND PATCH v10 0/6] optimize memblock_next_valid_pfn and early_pfn_valid on arm and arm64

2018-08-16 Thread Pasha Tatashin
On 18-08-15 15:34:56, Andrew Morton wrote: > On Fri, 6 Jul 2018 17:01:09 +0800 Jia He wrote: > > > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns > > where possible") optimized the loop in memmap_init_zone(). But it causes > > possible panic bug. So Daniel Vacek reverted

Re: [PATCH v3 4/4] mm/memory_hotplug: Drop node_online check in unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:19, Oscar Salvador wrote: > From: Oscar Salvador > > We are getting the nid from the pages that are not yet removed, > but a node can only be offline when its memory/cpu's have been removed. > Therefore, we know that the node is still online. Reviewed-by: Pavel Tatashin > >

Re: [PATCH v3 3/4] mm/memory_hotplug: Refactor unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
> > d) What's the maximum number of nodes, ever? Perhaps we can always > >fit a nodemask_t onto the stack, dunno. > > Right now, we define the maximum as NODES_SHIFT = 10, so: > > 1 << 10 = 1024 Maximum nodes. > > Since this makes only 128 bytes, I wonder if we can just go ahead and define

Re: [PATCH v3 2/4] mm/memory_hotplug: Drop mem_blk check from unregister_mem_sect_under_nodes

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:17, Oscar Salvador wrote: > From: Oscar Salvador > > Before calling to unregister_mem_sect_under_nodes(), > remove_memory_section() already checks if we got a valid memory_block. > > No need to check that again in unregister_mem_sect_under_nodes(). > > If more functions start

Re: [PATCH v3 1/4] mm/memory-hotplug: Drop unused args from remove_memory_section

2018-08-16 Thread Pasha Tatashin
On 18-08-15 16:42:16, Oscar Salvador wrote: > From: Oscar Salvador > > unregister_memory_section() calls remove_memory_section() > with three arguments: > > * node_id > * section > * phys_device > > Neither node_id nor phys_device are used. > Let us drop them from the function. Looks good: Rev

Re: [PATCH v2 3/4] mm/memory_hotplug: Make register_mem_sect_under_node a cb of walk_memory_range

2018-08-16 Thread Pasha Tatashin
On 18-06-22 13:18:38, osalva...@techadventures.net wrote: > From: Oscar Salvador > > link_mem_sections() and walk_memory_range() share most of the code, > so we can convert link_mem_sections() into a dummy function that calls > walk_memory_range() with a callback to register_mem_sect_under_no

Re: [PATCH v10 05/10] mm: zero reserved and unavailable struct pages

2017-10-06 Thread Pasha Tatashin
Hi Michal, As I've said in other reply this should go in only if the scenario you describe is real. I am somehow suspicious to be honest. I simply do not see how those weird struct pages would be in a valid pfn range of any zone. There are examples of both when unavailable memory is not part

Re: [PATCH] mm: deferred_init_memmap improvements

2017-10-06 Thread Pasha Tatashin
Hi Anshuman, Thank you very much for looking at this. My reply below: On 10/06/2017 02:48 AM, Anshuman Khandual wrote: On 10/04/2017 08:59 PM, Pavel Tatashin wrote: This patch fixes another existing issue on systems that have holes in zones, i.e. CONFIG_HOLES_IN_ZONE is defined. In for_each_me

Re: [PATCH v9 08/12] mm: zero reserved and unavailable struct pages

2017-10-04 Thread Pasha Tatashin
On 10/04/2017 10:04 AM, Michal Hocko wrote: On Wed 04-10-17 09:28:55, Pasha Tatashin wrote: I am not really familiar with the trim_low_memory_range code path. I am not even sure we have to care about it because nobody should be walking pfns outside of any zone. According to commit

Re: [PATCH v9 08/12] mm: zero reserved and unavailable struct pages

2017-10-04 Thread Pasha Tatashin
I am not really familiar with the trim_low_memory_range code path. I am not even sure we have to care about it because nobody should be walking pfns outside of any zone. According to commit comments first 4K belongs to BIOS, so I think the memory exists but BIOS may or may not report it to Li

Re: [PATCH v9 08/12] mm: zero reserved and unavailable struct pages

2017-10-04 Thread Pasha Tatashin
Could you be more specific where such memory is reserved? I know of one example: trim_low_memory_range() unconditionally reserves from pfn 0, but e820__memblock_setup() might provide the existing memory from pfn 1 (i.e. KVM). Then just initialize struct pages for that mapping right there whe

Re: [PATCH v9 06/12] mm: zero struct pages during initialization

2017-10-04 Thread Pasha Tatashin
On 10/04/2017 04:45 AM, Michal Hocko wrote: On Tue 03-10-17 11:22:35, Pasha Tatashin wrote: On 10/03/2017 09:08 AM, Michal Hocko wrote: On Wed 20-09-17 16:17:08, Pavel Tatashin wrote: Add struct page zeroing as a part of initialization of other fields in __init_single_page(). This single

Re: [PATCH v9 12/12] mm: stop zeroing memory during allocation in vmemmap

2017-10-03 Thread Pasha Tatashin
that are not zeroed properly. However, the first patch depends on mm: zero struct pages during initialization As it uses mm_zero_struct_page(). Pasha On 10/03/2017 11:34 AM, Pasha Tatashin wrote: On 10/03/2017 09:19 AM, Michal Hocko wrote: On Wed 20-09-17 16:17:14, Pavel Tatashin wrote

Re: [PATCH v9 03/12] mm: deferred_init_memmap improvements

2017-10-03 Thread Pasha Tatashin
-volatile counters, compiler will be smart enough to remove all the unnecessary de-references. As a plus, we won't be adding any new branches, and the code is still going to stay compact. Pasha On 10/03/2017 11:15 AM, Pasha Tatashin wrote: Hi Michal, Please be explicit that this is pos

Re: [PATCH v9 12/12] mm: stop zeroing memory during allocation in vmemmap

2017-10-03 Thread Pasha Tatashin
On 10/03/2017 09:19 AM, Michal Hocko wrote: On Wed 20-09-17 16:17:14, Pavel Tatashin wrote: vmemmap_alloc_block() will no longer zero the block, so zero memory at its call sites for everything except struct pages. Struct page memory is zero'd by struct page initialization. Replace allocators i

Re: [PATCH v9 08/12] mm: zero reserved and unavailable struct pages

2017-10-03 Thread Pasha Tatashin
On 10/03/2017 09:18 AM, Michal Hocko wrote: On Wed 20-09-17 16:17:10, Pavel Tatashin wrote: Some memory is reserved but unavailable: not present in memblock.memory (because not backed by physical pages), but present in memblock.reserved. Such memory has backing struct pages, but they are not ini

Re: [PATCH v9 06/12] mm: zero struct pages during initialization

2017-10-03 Thread Pasha Tatashin
On 10/03/2017 09:08 AM, Michal Hocko wrote: On Wed 20-09-17 16:17:08, Pavel Tatashin wrote: Add struct page zeroing as a part of initialization of other fields in __init_single_page(). This single thread performance collected on: Intel(R) Xeon(R) CPU E7-8895 v3 @ 2.60GHz with 1T of memory (26
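The patch under discussion folds the zeroing into the per-page initialization path instead of relying on pre-zeroed memmap memory; a simplified sketch of the idea (not the exact upstream hunk):

    static void __meminit __init_single_page(struct page *page, unsigned long pfn,
                                             unsigned long zone, int nid)
    {
            mm_zero_struct_page(page);   /* zero here, not at memmap allocation time */
            set_page_links(page, zone, nid, pfn);
            init_page_count(page);
            page_mapcount_reset(page);
            /* ... remaining per-page field initialization ... */
    }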

Re: [PATCH v9 04/12] sparc64: simplify vmemmap_populate

2017-10-03 Thread Pasha Tatashin
Acked-by: Michal Hocko Thank you, Pasha

Re: [PATCH v9 03/12] mm: deferred_init_memmap improvements

2017-10-03 Thread Pasha Tatashin
Hi Michal, Please be explicit that this is possible only because we discard memblock data later after 3010f876500f ("mm: discard memblock data later"). Also be more explicit how the new code works. OK I like how the resulting code is more compact and smaller. That was the goal :) for_e

Re: [PATCH v9 02/12] sparc64/mm: setting fields in deferred pages

2017-10-03 Thread Pasha Tatashin
As you separated x86 and sparc patches doing essentially the same I assume David is going to take this patch? Correct, I noticed that usually platform specific changes are done in separate patches even if they are small. Dave already Acked this patch. So, I do not think it should be separated

Re: [PATCH v9 01/12] x86/mm: setting fields in deferred pages

2017-10-03 Thread Pasha Tatashin
Hi Michal, I hope I haven't missed anything but it looks good to me. Acked-by: Michal Hocko Thank you for your review. one nit below --- arch/x86/mm/init_64.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c in

Re: [PATCH v9 09/12] mm/kasan: kasan specific map populate function

2017-10-03 Thread Pasha Tatashin
Hi Mark, I considered using a new *populate() function for shadow without using vmemmap_populate(), but that makes things unnecessarily complicated: vmemmap_populate() has built-in: 1. large page support 2. device memory support 3. node locality support 4. several config based variants on diffe

Re: [PATCH v6 1/4] sched/clock: interface to allow timestamps early in boot

2017-09-28 Thread Pasha Tatashin
It will be best if we can support TSC sync capability in x86, but it seems that is not easy. Sure, your hardware achieving sync would be best, but even if it does not, we can still use TSC. Using notsc simply because you fail to sync TSCs is quite crazy. The thing is, we need to support unsync'ed TSC i

Re: [PATCH v6 1/4] sched/clock: interface to allow timestamps early in boot

2017-09-27 Thread Pasha Tatashin
. IMO, the latter is better, but either way works for me. Thank you, Pasha On 09/27/2017 09:52 AM, Dou Liyang wrote: Hi Pasha, Peter At 09/27/2017 09:16 PM, Pasha Tatashin wrote: Hi Peter, I am totally happy with removing notsc. This certainly simplifies the sched_clock code. Are there any issues

Re: [PATCH v6 1/4] sched/clock: interface to allow timestamps early in boot

2017-09-27 Thread Pasha Tatashin
Hi Russell, This might be so for ARM, and in fact if you look at my SPARC implementation, I simply made the source clock initialize early, so regular sched_clock() is used. As on SPARC, we use either %tick or %stick registers with frequency determined via OpenFirmware. But, on x86 there are dozen

Re: [PATCH v6 1/4] sched/clock: interface to allow timestamps early in boot

2017-09-27 Thread Pasha Tatashin
Hi Peter, I am totally happy with removing notsc. This certainly simplifies the sched_clock code. Are there any issues with removing existing kernel parameters that I should be aware of? Thank you, Pasha On 09/27/2017 09:10 AM, Peter Zijlstra wrote: On Wed, Sep 27, 2017 at 02:58:57PM +0200,

Re: [PATCH v6 4/4] x86/tsc: use tsc early

2017-08-30 Thread Pasha Tatashin
Hi Fenghua, Thank you for looking at this. Unfortunately I can't mark either of them __init because sched_clock_early() is called from u64 sched_clock_cpu(int cpu), which is around for the life of the system. Thank you, Pasha On 08/30/2017 05:21 PM, Fenghua Yu wrote: On Wed, Aug 30,

Re: [PATCH v7 07/11] sparc64: optimized struct page zeroing

2017-08-30 Thread Pasha Tatashin
Hi Dave, Thank you for acking. The reason I am not doing initializing stores is because they require a membar, even if only regular stores are following (I hoped to do a membar before first load). This is something I was thinking was not true, but after consulting with colleagues and checking

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
void __init timekeeping_init(void) { /* * We must determine boot timestamp before getting current * persistent clock value, because implementation of * read_boot_clock64() might also call the persistent * clock, and a leap second may occur.

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
And because it's called only once, it does not need to be marked __init() and must be kept around forever, right? This is because every other architecture implements read_boot_clock64() without __init: arm, s390. Besides, the original weak stub does not have the __init macro. So, I can certainly try t

Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-28 Thread Pasha Tatashin
Hi Thomas, Thank you for your comments. My replies below. +/* + * Called once during boot to initialize boot time. + */ +void read_boot_clock64(struct timespec64 *ts) And because it's called only once, it does not need to be marked __init() and must be kept around forever, right? This is

Re: [v6 01/15] x86/mm: reserve only exiting low pages

2017-08-17 Thread Pasha Tatashin
Hi Michal, While working on a bug that was reported to me by "kernel test robot". unable to handle kernel NULL pointer dereference at (null) The issue was that page_to_pfn() on that configuration was looking for a section inside flags fields in "struct page". So, reserved but unava

Re: [v6 05/15] mm: don't accessed uninitialized struct pages

2017-08-17 Thread Pasha Tatashin
to use this iterator, which will simplify it. Pasha On 08/14/2017 09:51 AM, Pasha Tatashin wrote: mem_init() free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- if this is deferr

Re: [v3 1/2] sched/clock: interface to allow timestamps early in boot

2017-08-14 Thread Pasha Tatashin
Hi Dou, Thank you for your comments: { x86_init.timers.timer_init(); tsc_init(); +tsc_early_fini(); tsc_early_fini() is defined in patch 2, I guess you may miss it when you split your patches. Indeed, I will move it to patch 2. +static DEFINE_STATIC_KEY_TRUE(__use_sched_clo

Re: [v6 15/15] mm: debug for raw alloctor

2017-08-14 Thread Pasha Tatashin
However, now thinking about it, I will change it to CONFIG_MEMBLOCK_DEBUG, and let users decide what other debugging configs need to be enabled, as this is also OK. Actually the more I think about it the more I am convinced that a kernel boot parameter would be better because it doesn't need the

Re: [v6 05/15] mm: don't accessed uninitialized struct pages

2017-08-14 Thread Pasha Tatashin
mem_init() free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- if this is deferred reserved page __init_single_pfn() __init_single_page() So, currently, we are using the value of page->f

Re: [v6 04/15] mm: discard memblock data later

2017-08-14 Thread Pasha Tatashin
#ifdef CONFIG_MEMBLOCK in page_alloc, or define memblock_discard() stubs in the nobootmem header file. This is the standard way to do this. And it is usually preferred to proliferating ifdefs in the code. Hi Michal, As you suggested, I sent out this patch separately. If you feel strongly that this s

Re: [v6 04/15] mm: discard memblock data later

2017-08-14 Thread Pasha Tatashin
OK, I will post it separately. No, it does not depend on the rest, but the rest depends on this. So, I am not sure how to enforce that this comes before the rest. Andrew will take care of that. Just make it explicit that some of the patch depends on an earlier work when reposting. Ok. Yes, t

Re: [v6 02/15] x86/mm: setting fields in deferred pages

2017-08-14 Thread Pasha Tatashin
On 08/14/2017 07:43 AM, Michal Hocko wrote: register_page_bootmem_info register_page_bootmem_info_node get_page_bootmem .. setting fields here .. such as: page->freelist = (void *)type; free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve

Re: [v6 01/15] x86/mm: reserve only exiting low pages

2017-08-14 Thread Pasha Tatashin
Correct, the pgflags asserts were triggered when we were setting reserved flags to struct page for PFN 0 in which was never initialized through __init_single_page(). The reason they were triggered is because we set all uninitialized memory to ones in one of the debug patches. And why don't we ne

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
Hi Michal, This suggestion won't work, because there are arches without memblock support: tile, sh... So, I would still need to have: #ifdef CONFIG_MEMBLOCK in page_alloc, or define memblock_discard() stubs in the nobootmem header file. In either case it would become messier than what it is right

Re: [v6 07/15] mm: defining memblock_virt_alloc_try_nid_raw

2017-08-11 Thread Pasha Tatashin
Sure, I could do this, but as I understood from earlier Dave Miller's comments, we should do one logical change at a time. Hence, introduce API in one patch use it in another. So, this is how I tried to organize this patch set. Is this assumption incorrect? Well, it really depends. If the patch

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
I will address your comment, and send out a new patch. Should I send it out separately from the series or should I keep it inside? I would post it separately. It doesn't depend on the rest. OK, I will post it separately. No, it does not depend on the rest, but the rest depends on this. So, I

Re: [v6 15/15] mm: debug for raw alloctor

2017-08-11 Thread Pasha Tatashin
When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is returned by memblock_virt_alloc_try_nid_raw() to ones to ensure that no places expect zeroed memory. Please fold this into the patch which introduces memblock_virt_alloc_try_nid_raw. OK I am not sure CONFIG_DEBUG_VM is

Re: [v6 14/15] mm: optimize early system hash allocations

2017-08-11 Thread Pasha Tatashin
Clients can call alloc_large_system_hash() with the HASH_ZERO flag to specify that the memory allocated for the system hash needs to be zeroed; otherwise the memory does not need to be zeroed, and the client will initialize it. If memory does not need to be zero'd, call the new memblock_virt_alloc_raw(
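Put differently, the allocation branch can skip the memset entirely when HASH_ZERO was not requested; a hedged sketch of the call-site logic (memblock_virt_alloc_raw() is the helper introduced earlier in this series, and the surrounding variable names are illustrative):

    /* Sketch: zero the table only when the caller asked for it via HASH_ZERO. */
    if (flags & HASH_ZERO)
            table = memblock_virt_alloc(size, 0);      /* returned zeroed       */
    else
            table = memblock_virt_alloc_raw(size, 0);  /* caller initializes it */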

Re: [v6 13/15] mm: stop zeroing memory during allocation in vmemmap

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 09:04 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:47, Pavel Tatashin wrote: Replace allocators in sparse-vmemmap to use the non-zeroing version. So, we will get the performance improvement by zeroing the memory in parallel when struct pages are zeroed. First of all this should

Re: [v6 09/15] sparc64: optimized struct page zeroing

2017-08-11 Thread Pasha Tatashin
Add an optimized mm_zero_struct_page(), so struct pages are zeroed without calling memset(). We do eight to ten regular stores based on the size of struct page. The compiler optimizes out the conditions of the switch() statement. Again, this doesn't explain why we need this. You have mentioned those r
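The point is that sizeof(struct page) is a small compile-time constant, so the zeroing can be open-coded as a handful of 64-bit stores; a generic-C sketch of the approach (the sparc64 version in the patch uses inline assembly, so this is only illustrative):

    /* Sketch: zero a struct page with explicit 8-byte stores instead of memset().
     * The switch collapses at compile time since sizeof(struct page) is constant. */
    #define mm_zero_struct_page(pp) do {                             \
            unsigned long *_p = (unsigned long *)(pp);               \
            switch (sizeof(struct page)) {                           \
            case 80:                                                 \
                    _p[9] = 0; _p[8] = 0;         /* fall through */ \
            case 64:                                                 \
                    _p[7] = 0; _p[6] = 0; _p[5] = 0; _p[4] = 0;      \
                    _p[3] = 0; _p[2] = 0; _p[1] = 0; _p[0] = 0;      \
                    break;                                           \
            default:                                                 \
                    memset((pp), 0, sizeof(struct page));            \
            }                                                        \
    } while (0)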

Re: [v6 08/15] mm: zero struct pages during initialization

2017-08-11 Thread Pasha Tatashin
I believe this deserves a much more detailed explanation of why this is safe. What actually prevents any pfn walker from seeing an uninitialized struct page? Please make your assumptions explicit in the commit log so that we can check them independently. There is nothing that prevents pfn walkers from wal

Re: [v6 07/15] mm: defining memblock_virt_alloc_try_nid_raw

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 08:39 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:41, Pavel Tatashin wrote: A new variant of memblock_virt_alloc_* allocations: memblock_virt_alloc_try_nid_raw() - Does not zero the allocated memory - Does not panic if request cannot be satisfied OK, this looks good b

Re: [v6 05/15] mm: don't accessed uninitialized struct pages

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 05:37 AM, Michal Hocko wrote: On Mon 07-08-17 16:38:39, Pavel Tatashin wrote: In deferred_init_memmap() where all deferred struct pages are initialized we have a check like this: if (page->flags) { VM_BUG_ON(page_zone(page) != zone); goto free_range;

Re: [v6 04/15] mm: discard memblock data later

2017-08-11 Thread Pasha Tatashin
I guess this goes all the way down to Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel with kswapd") I will add this to the patch. Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Considering that

Re: [v6 02/15] x86/mm: setting fields in deferred pages

2017-08-11 Thread Pasha Tatashin
AFAIU register_page_bootmem_info_node is only about struct pages backing pgdat, usemap and memmap. Those should be in reserved memblocks and we do not initialize those at later times, they are not relevant to the deferred initialization as your changelog suggests so the ordering with get_page_boot

Re: [v6 01/15] x86/mm: reserve only exiting low pages

2017-08-11 Thread Pasha Tatashin
Struct pages are initialized by going through __init_single_page(). Since the existing physical memory in memblock is represented in memblock.memory list, struct page for every page from this list goes through __init_single_page(). By a page _from_ this list you mean struct pages backing the phy

Re: [v6 00/15] complete deferred page initialization

2017-08-11 Thread Pasha Tatashin
On 08/11/2017 03:58 AM, Michal Hocko wrote: [I am sorry I didn't get to your previous versions] Thank you for reviewing this work. I will address your comments and send out new patches. In this work we do the following: - Never read access struct page until it was initialized How is t

Re: [v6 11/15] arm64/kasan: explicitly zero kasan shadow memory

2017-08-08 Thread Pasha Tatashin
On 2017-08-08 09:15, David Laight wrote: From: Pasha Tatashin Sent: 08 August 2017 12:49 Thank you for looking at this change. What you described was in my previous iterations of this project. See for example here: https://lkml.org/lkml/2017/5/5/369 I was asked to remove that flag, and only
