iommu allocations should be accounted in order to allow admins to
monitor and limit the amount of iommu memory.
Signed-off-by: Pasha Tatashin
Acked-by: Michael S. Tsirkin
---
drivers/vhost/vdpa.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Changelog:
v1:
This patch is spun off
iommu allocations should be accounted in order to allow admins to
monitor and limit the amount of iommu memory.
Signed-off-by: Pasha Tatashin
---
drivers/vhost/vdpa.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
This patch is spun off from the series:
https://lore.kernel.org/all
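For reference, a minimal sketch of the kind of two-line change the diffstat above implies, assuming iommu_map() takes a gfp argument as the rest of the series proposes (illustrative, not the exact hunk):

    /*
     * drivers/vhost/vdpa.c, vhost_vdpa_map() -- sketch only.
     * Passing GFP_KERNEL_ACCOUNT makes the IOMMU page-table pages
     * allocated for this mapping chargeable to, and limitable by,
     * the caller's memory cgroup.
     */
    r = iommu_map(v->domain, iova, pa, size,
                  perm_to_iommu_flags(perm), GFP_KERNEL_ACCOUNT);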
g the numa node is taken from cxl_pmem_region_probe in
> drivers/cxl/pmem.c.
>
> Signed-off-by: Michael Sammler
Enables hot-plugging of virtio-pmem memory into the correct memory
nodes. It does not look like it affects FS_DAX.
Reviewed-by: Pasha Tatashin
Thanks,
Pasha
> ---
>
On 10/31/18 12:05 PM, Alexander Duyck wrote:
> On Wed, 2018-10-31 at 15:40 +0000, Pasha Tatashin wrote:
>>
>> On 10/17/18 7:54 PM, Alexander Duyck wrote:
>>> This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone.
>>>
>>> This iter
On 10/17/18 7:54 PM, Alexander Duyck wrote:
> This patch introduces a new iterator for_each_free_mem_pfn_range_in_zone.
>
> This iterator will take care of making sure a given memory range provided
> is in fact contained within a zone. It takes care of all the bounds checking
> we were doing in d
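A rough sketch of how such an iterator is meant to be used (the loop body here is illustrative, not the posted patch):

    u64 i;
    unsigned long spfn, epfn, nr_pages = 0;

    /*
     * Each [spfn, epfn) range returned is free memory (present in
     * memblock.memory but not in memblock.reserved) already clamped to
     * the zone, so the caller no longer open-codes those bounds checks.
     */
    for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn)
        nr_pages += epfn - spfn;    /* initialize struct pages for this range here */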
On 9/20/18 6:29 PM, Alexander Duyck wrote:
> The ZONE_DEVICE pages were being initialized in two locations. One was with
> the memory_hotplug lock held and another was outside of that lock. The
> problem with this is that it was nearly doubling the memory initialization
> time. Instead of doing t
On 9/21/18 9:26 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> While looking at node_states_check_changes_online, I stumbled
> upon some confusing things.
>
> Right after entering the function, we find this:
>
> if (N_MEMORY == N_NORMAL_MEMORY)
> zone_last = ZONE_MOVABLE;
>
> T
On 9/19/18 6:08 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> This patch, as the previous one, gets rid of the wrong if statements.
> While at it, I realized that the comments are sometimes very confusing,
> to say the least, and wrong.
> For example:
>
> ---
> zone_last = ZONE_MOVABLE;
>
On 9/19/18 6:08 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> While looking at node_states_check_changes_online, I stumbled
> upon some confusing things.
>
> Right after entering the function, we find this:
>
> if (N_MEMORY == N_NORMAL_MEMORY)
> zone_last = ZONE_MOVABLE;
>
> T
On 9/19/18 6:08 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> node_states_clear has the following if statements:
>
> if ((N_MEMORY != N_NORMAL_MEMORY) &&
> (arg->status_change_nid_high >= 0))
> ...
>
> if ((N_MEMORY != N_HIGH_MEMORY) &&
> (arg->status_change_nid >= 0))
On 9/19/18 6:08 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set
> to status_change_nid_normal, but on such systems N_HIGH_MEMORY falls
> back to N_NORMAL_MEMORY.
> That means that if status_change_nid_normal is not -1,
>
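For context, the relevant part of enum node_states, roughly as it was defined in include/linux/nodemask.h at the time: N_HIGH_MEMORY aliases N_NORMAL_MEMORY on !CONFIG_HIGHMEM, while N_MEMORY is always a distinct state, which is why the quoted N_MEMORY == N_NORMAL_MEMORY check can never be true.

    enum node_states {
        N_POSSIBLE,             /* The node could become online at some point */
        N_ONLINE,               /* The node is online */
        N_NORMAL_MEMORY,        /* The node has regular memory */
    #ifdef CONFIG_HIGHMEM
        N_HIGH_MEMORY,          /* The node has regular or high memory */
    #else
        N_HIGH_MEMORY = N_NORMAL_MEMORY,
    #endif
        N_MEMORY,               /* The node has memory (regular, high, movable) */
        N_CPU,                  /* The node has one or more cpus */
        NR_NODE_STATES
    };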
On 9/19/18 6:08 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> In node_states_check_changes_online, we check if the node will
> have to be set for any of the N_*_MEMORY states after the pages
> have been onlined.
>
> Later on, we perform the activation in node_states_set_node.
> Currentl
On 9/12/18 10:27 AM, Gerald Schaefer wrote:
> On Wed, 12 Sep 2018 15:39:33 +0200
> Michal Hocko wrote:
>
>> On Wed 12-09-18 15:03:56, Gerald Schaefer wrote:
>> [...]
>>> BTW, those sysfs attributes are world-readable, so anyone can trigger
>>> the panic by simply reading them, or just run lsmem
On 9/10/18 10:41 AM, Michal Hocko wrote:
> On Mon 10-09-18 14:32:16, Pavel Tatashin wrote:
>> On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote:
>>>
>>> On Mon 10-09-18 14:11:45, Pavel Tatashin wrote:
Hi Michal,
It is tricky, but probably can be done. Either change
memmap_i
On Mon, Sep 10, 2018 at 10:19 AM Michal Hocko wrote:
>
> On Mon 10-09-18 14:11:45, Pavel Tatashin wrote:
> > Hi Michal,
> >
> > It is tricky, but probably can be done. Either change
> > memmap_init_zone() or its caller to also cover the ends and starts of
> > unaligned sections to initialize and r
Hi Michal,
It is tricky, but probably can be done. Either change
memmap_init_zone() or its caller to also cover the ends and starts of
unaligned sections to initialize and reserve pages.
The same thing would also need to be done in deferred_init_memmap() to
cover the deferred init case.
For hotp
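A minimal sketch of the idea being discussed (illustrative only; end_pfn, zone_id and nid stand for whatever the caller already has at hand): round the end of the range up to a section boundary and initialize plus reserve the struct pages in the unaligned tail.

    unsigned long pfn = end_pfn;    /* first pfn past the real memory */
    unsigned long sec_end = ALIGN(end_pfn, PAGES_PER_SECTION);

    for (; pfn < sec_end; pfn++) {
        struct page *page = pfn_to_page(pfn);

        /* give pfn walkers an initialized, reserved page instead of garbage */
        __init_single_page(page, pfn, zone_id, nid);
        SetPageReserved(page);
    }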
On 9/10/18 9:17 AM, Michal Hocko wrote:
> [Cc Pavel]
>
> On Mon 10-09-18 14:35:27, Mikhail Zaslonko wrote:
>> If memory end is not aligned with the linux memory section boundary, such
>> a section is only partly initialized. This may lead to VM_BUG_ON due to
>> uninitialized struct pages access
Hi Ville,
The failure is surprising, because the commit is tiny, and almost does
not change the code logic.
From looking through the commit, the only functional difference this
commit makes is:
static_branch_enable(&__use_tsc) was called unconditionally from
tsc_init(), but after the commit onl
On 9/6/18 1:03 PM, Michal Hocko wrote:
> On Thu 06-09-18 08:41:52, Alexander Duyck wrote:
>> On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote:
>>>
>>> On Thu 06-09-18 07:59:03, Dave Hansen wrote:
On 09/05/2018 10:47 PM, Michal Hocko wrote:
> why do you have to keep DEBUG_VM enabled for
On 9/6/18 11:41 AM, Alexander Duyck wrote:
> On Thu, Sep 6, 2018 at 8:13 AM Michal Hocko wrote:
>>
>> On Thu 06-09-18 07:59:03, Dave Hansen wrote:
>>> On 09/05/2018 10:47 PM, Michal Hocko wrote:
why do you have to keep DEBUG_VM enabled for workloads where the boot
time matters so much
On 9/5/18 5:29 PM, Alexander Duyck wrote:
> On Wed, Sep 5, 2018 at 2:22 PM Pasha Tatashin
> wrote:
>>
>>
>>
>> On 9/5/18 5:13 PM, Alexander Duyck wrote:
>>> From: Alexander Duyck
>>>
>>> On systems with a large amount of memory it can t
On 9/5/18 5:13 PM, Alexander Duyck wrote:
> From: Alexander Duyck
>
> On systems with a large amount of memory it can take a significant amount
> of time to initialize all of the page structs with the PAGE_POISON_PATTERN
> value. I have seen it take over 2 minutes to initialize a system with
>
On 9/5/18 4:18 PM, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko wrote:
>>
>> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
>>> From: Alexander Duyck
>>>
>>> It doesn't make much sense to use the atomic SetPageReserved at init time
>>> when we are using memset to clea
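The point being made is that at memmap init time the page is not yet visible to anyone else, so the non-atomic flag helper can replace the locked bit operation; a sketch of the substitution (illustrative):

    /* before: one atomic RMW per struct page during init */
    SetPageReserved(page);

    /* after: a plain store is enough while the page is still private */
    __SetPageReserved(page);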
On 9/5/18 2:38 AM, Mike Rapoport wrote:
> On Tue, Sep 04, 2018 at 05:28:13PM -0400, Daniel Jordan wrote:
>> Pavel Tatashin, Ying Huang, and I are excited to be organizing a performance
>> and scalability microconference this year at Plumbers[*], which is happening
>> in Vancouver this year. Th
On 9/4/18 5:13 PM, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 1:07 PM Pasha Tatashin
> wrote:
>>
>> Hi Alexander,
>>
>> This is a wrong way to do it. memblock_virt_alloc_try_nid_raw() does not
>> initialize allocated memory, and by setting memory to
Hi Alexander,
This is a wrong way to do it. memblock_virt_alloc_try_nid_raw() does not
initialize allocated memory, and by setting memory to all ones in debug
build we ensure that no callers rely on this function to return zeroed
memory just by accident.
And, the accidents are frequent because mo
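A sketch of the debug behavior being described, close to how the allocator looked at the time (details illustrative):

    void * __init memblock_virt_alloc_try_nid_raw(phys_addr_t size, phys_addr_t align,
                                                  phys_addr_t min_addr, phys_addr_t max_addr,
                                                  int nid)
    {
        void *ptr = memblock_virt_alloc_internal(size, align, min_addr, max_addr, nid);

    #ifdef CONFIG_DEBUG_VM
        /* fill with ones so callers that assume zeroed memory fail loudly */
        if (ptr && size > 0)
            memset(ptr, 0xff, size);
    #endif
        return ptr;
    }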
Deferred struct page init is needed only on systems with a large amount of
physical memory to improve boot performance. 32-bit systems do not benefit
from this feature.
Jiri reported a problem where deferred struct pages do not work well with
x86-32:
[0.035162] Dentry cache hash table entries:
On 8/31/18 8:24 AM, Oscar Salvador wrote:
> On Thu, Aug 30, 2018 at 01:55:29AM +0000, Pasha Tatashin wrote:
>> I would re-write the above function like this:
>> static void check_for_memory(pg_data_t *pgdat, int nid)
>> {
>> enum zone_type zone_type;
>&
On 8/20/18 6:46 AM, David Hildenbrand wrote:
> On 16.08.2018 12:06, David Hildenbrand wrote:
>> Let's try to minimize the noise.
>>
>> Signed-off-by: David Hildenbrand
>> ---
>> mm/memory_hotplug.c | 6 ++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_
LGTM
Reviewed-by: Pavel Tatashin
On 8/16/18 6:06 AM, David Hildenbrand wrote:
> Onlining pages can only fail if a notifier reported a problem (e.g. -ENOMEM).
> online_pages_range() can never fail.
>
> Signed-off-by: David Hildenbrand
> ---
> mm/memory_hotplug.c | 9 ++---
> 1 file changed
On 8/16/18 7:00 AM, David Hildenbrand wrote:
> On 16.08.2018 12:47, Oscar Salvador wrote:
>> On Thu, Aug 16, 2018 at 12:06:26PM +0200, David Hildenbrand wrote:
>>
>>> +
>>> +/* check if all mem sections are offline */
>>> +bool mem_sections_offline(unsigned long pfn, unsigned long end_pfn)
>>> +{
>
Hi David,
I am not sure this is needed, because we already have a stricter checker:
check_hotplug_memory_range()
You could call it from online_pages(), if you think there is a reason to
do it, but other than that it is done from add_memory_resource() and
from remove_memory().
Thank you,
Pavel
On 8/30/18 4:17 PM, Pasha Tatashin wrote:
> I guess the wrap was done because of __ref, but no reason to have this
> wrap. So looks good to me.
>
> Reviewed-by: Pavel Tatashin
> On 8/16/18 6:06 AM, David Hildenbrand wrote:
>> Let's avoid this indirecti
I guess the wrap was done because of __ref, but no reason to have this
wrap. So looks good to me.
Reviewed-by: Pavel Tatashin
On 8/16/18 6:06 AM, David Hildenbrand wrote:
> Let's avoid this indirection and just call the function offline_pages().
>
> Signed-off-by: David Hildenbrand
> ---
> mm
On 8/28/18 5:01 PM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> check_for_memory looks a bit confusing.
> First of all, we have this:
>
> if (N_MEMORY == N_NORMAL_MEMORY)
> return;
>
> Checking the enum declaration, it looks like N_MEMORY cannot be equal to
> N_NORMAL_MEMORY.
> I could
On 8/17/18 11:41 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> Currently, we decrement zone/node spanned_pages when we
> remove memory and not when we offline it.
>
> This, besides being inconsistent with the current code,
> implies that we can access stale pages if we never get to
On 8/28/18 7:25 AM, Michal Hocko wrote:
> On Tue 28-08-18 11:05:39, Mikhail Zaslonko wrote:
>> Within show_valid_zones() the function test_pages_in_a_zone() should be
>> called for online memory blocks only. Otherwise it might lead to the
>> VM_BUG_ON due to uninitialized struct pages (when CONFI
On 8/17/18 5:00 AM, Oscar Salvador wrote:
> From: Oscar Salvador
>
> Currently, unregister_mem_sect_under_nodes() tries to allocate a nodemask_t
> in order to check within the loop which nodes have already been unlinked,
> so we do not repeat the operation on them.
>
> NODEMASK_ALLOC calls km
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote:
> From: Naoya Horiguchi
>
> There is a kernel panic that is triggered when reading /proc/kpageflags
> on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
>
> BUG: unable to handle kernel paging request at fffe
> PGD 9b2
On 8/23/18 2:25 PM, Masayoshi Mizuma wrote:
> From: Masayoshi Mizuma
>
> commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
> memblock.reserved") breaks movable_node kernel option because it
> changed the memory gap range to reserved memblock. So, the node
> is marked as Normal zone
On Mon, Aug 27, 2018 at 8:31 AM Masayoshi Mizuma wrote:
>
> Hi Pavel,
>
> I would appreciate if you could send the feedback for the patch.
I will study it today.
Pavel
>
> Thanks!
> Masa
>
> On 08/24/2018 04:29 AM, Michal Hocko wrote:
> > On Fri 24-08-18 00:03:25, Naoya Horiguchi wrote:
> >> (C
On 7/6/18 5:01 AM, Jia He wrote:
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But there is
> still some room for improvement. E.g. in early_pfn_valid(), if pfn and
> pfn+1 are in the same memblock region, we
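A rough illustration of the caching idea (names are made up for the example; this is not the posted patch): remember the index of the memblock region that matched last time and check it first, so consecutive pfns avoid repeating the binary search.

    static int early_region_idx __initdata = -1;

    /* hypothetical helper: fast path for pfn_valid()-style checks */
    static bool __init pfn_in_cached_region(unsigned long pfn)
    {
        struct memblock_region *r;

        if (early_region_idx < 0)
            return false;

        r = &memblock.memory.regions[early_region_idx];
        return pfn >= memblock_region_memory_base_pfn(r) &&
               pfn <  memblock_region_memory_end_pfn(r);
    }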
On 8/16/18 9:08 PM, Pavel Tatashin wrote:
>
>> Signed-off-by: Jia He
>> ---
>> mm/memblock.c | 37 +
>> 1 file changed, 29 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index ccad225..84f7fa7 100644
>> --- a/mm/memblock.c
> Signed-off-by: Jia He
> ---
> mm/memblock.c | 37 +
> 1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memblock.c b/mm/memblock.c
> index ccad225..84f7fa7 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1140,31 +1140,52 @@ int _
On 18-07-06 17:01:11, Jia He wrote:
> From: Jia He
>
> Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> where possible") optimized the loop in memmap_init_zone(). But it causes
> possible panic bug. So Daniel Vacek reverted it later.
>
> But as suggested by Daniel Vacek,
On 18-08-15 15:34:56, Andrew Morton wrote:
> On Fri, 6 Jul 2018 17:01:09 +0800 Jia He wrote:
>
> > Commit b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns
> > where possible") optimized the loop in memmap_init_zone(). But it causes
> > possible panic bug. So Daniel Vacek reverted
On 18-08-15 16:42:19, Oscar Salvador wrote:
> From: Oscar Salvador
>
> We are getting the nid from the pages that are not yet removed,
> but a node can only be offline when its memory/cpu's have been removed.
> Therefore, we know that the node is still online.
Reviewed-by: Pavel Tatashin
>
>
> > d) What's the maximum number of nodes, ever? Perhaps we can always
> >fit a nodemask_t onto the stack, dunno.
>
> Right now, we define the maximum as NODES_SHIFT = 10, so:
>
> 1 << 10 = 1024 Maximum nodes.
>
> Since this makes only 128 bytes, I wonder if we can just go ahead and define
On 18-08-15 16:42:17, Oscar Salvador wrote:
> From: Oscar Salvador
>
> Before calling to unregister_mem_sect_under_nodes(),
> remove_memory_section() already checks if we got a valid memory_block.
>
> No need to check that again in unregister_mem_sect_under_nodes().
>
> If more functions start
On 18-08-15 16:42:16, Oscar Salvador wrote:
> From: Oscar Salvador
>
> unregister_memory_section() calls remove_memory_section()
> with three arguments:
>
> * node_id
> * section
> * phys_device
>
> Neither node_id nor phys_device are used.
> Let us drop them from the function.
Looks good:
Rev
On 18-06-22 13:18:38, osalva...@techadventures.net wrote:
> From: Oscar Salvador
>
> link_mem_sections() and walk_memory_range() share most of the code,
> so we can convert link_mem_sections() into a dummy function that calls
> walk_memory_range() with a callback to register_mem_sect_under_no
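Roughly, the refactor turns link_mem_sections() into a thin wrapper; a sketch (the real signatures may differ slightly):

    int link_mem_sections(int nid, unsigned long start_pfn, unsigned long nr_pages)
    {
        /* register every memory block in the range under @nid */
        return walk_memory_range(start_pfn, start_pfn + nr_pages, (void *)&nid,
                                 register_mem_sect_under_node);
    }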
Hi Michal,
As I've said in other reply this should go in only if the scenario you
describe is real. I am somehow suspicious to be honest. I simply do not
see how those weird struct pages would be in a valid pfn range of any
zone.
There are examples of both when unavailable memory is not part
Hi Anshuman,
Thank you very much for looking at this. My reply below:
On 10/06/2017 02:48 AM, Anshuman Khandual wrote:
On 10/04/2017 08:59 PM, Pavel Tatashin wrote:
This patch fixes another existing issue on systems that have holes in
zones i.e CONFIG_HOLES_IN_ZONE is defined.
In for_each_me
On 10/04/2017 10:04 AM, Michal Hocko wrote:
On Wed 04-10-17 09:28:55, Pasha Tatashin wrote:
I am not really familiar with the trim_low_memory_range code path. I am
not even sure we have to care about it because nobody should be walking
pfns outside of any zone.
According to commit
I am not really familiar with the trim_low_memory_range code path. I am
not even sure we have to care about it because nobody should be walking
pfns outside of any zone.
According to commit comments first 4K belongs to BIOS, so I think the
memory exists but BIOS may or may not report it to Li
Could you be more specific where is such a memory reserved?
I know of one example: trim_low_memory_range() unconditionally reserves from
pfn 0, but e820__memblock_setup() might provide the existing memory from pfn
1 (i.e. KVM).
Then just initialize struct pages for that mapping right there whe
On 10/04/2017 04:45 AM, Michal Hocko wrote:
On Tue 03-10-17 11:22:35, Pasha Tatashin wrote:
On 10/03/2017 09:08 AM, Michal Hocko wrote:
On Wed 20-09-17 16:17:08, Pavel Tatashin wrote:
Add struct page zeroing as a part of initialization of other fields in
__init_single_page().
This single
that are not zeroed properly.
However, the first patch depends on
mm: zero struct pages during initialization
As it uses mm_zero_struct_page().
Pasha
On 10/03/2017 11:34 AM, Pasha Tatashin wrote:
On 10/03/2017 09:19 AM, Michal Hocko wrote:
On Wed 20-09-17 16:17:14, Pavel Tatashin wrote
-volatile counters, compiler
will be smart enough to remove all the unnecessary de-references. As a
plus, we won't be adding any new branches, and the code is still going
to stay compact.
Pasha
On 10/03/2017 11:15 AM, Pasha Tatashin wrote:
Hi Michal,
Please be explicit that this is pos
On 10/03/2017 09:19 AM, Michal Hocko wrote:
On Wed 20-09-17 16:17:14, Pavel Tatashin wrote:
vmemmap_alloc_block() will no longer zero the block, so zero memory
at its call sites for everything except struct pages. Struct page memory
is zero'd by struct page initialization.
Replace allocators i
On 10/03/2017 09:18 AM, Michal Hocko wrote:
On Wed 20-09-17 16:17:10, Pavel Tatashin wrote:
Some memory is reserved but unavailable: not present in memblock.memory
(because not backed by physical pages), but present in memblock.reserved.
Such memory has backing struct pages, but they are not ini
On 10/03/2017 09:08 AM, Michal Hocko wrote:
On Wed 20-09-17 16:17:08, Pavel Tatashin wrote:
Add struct page zeroing as a part of initialization of other fields in
__init_single_page().
This single thread performance collected on: Intel(R) Xeon(R) CPU E7-8895
v3 @ 2.60GHz with 1T of memory (26
Acked-by: Michal Hocko
Thank you,
Pasha
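A simplified sketch of the change under review (the remaining field initializers are abbreviated): the struct page is zeroed inside __init_single_page() itself, together with the other fields, so no separate memset pass over the whole memmap is needed.

    static void __meminit __init_single_page(struct page *page, unsigned long pfn,
                                             unsigned long zone, int nid)
    {
        mm_zero_struct_page(page);      /* zero here, along with the other fields */
        set_page_links(page, zone, nid, pfn);
        init_page_count(page);
        page_mapcount_reset(page);
        INIT_LIST_HEAD(&page->lru);
        /* remaining per-page init elided in this sketch */
    }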
Hi Michal,
Please be explicit that this is possible only because we discard
memblock data later after 3010f876500f ("mm: discard memblock data
later"). Also be more explicit how the new code works.
OK
I like how the resulting code is more compact and smaller.
That was the goal :)
for_e
As you separated x86 and sparc patches doing essentially the same I
assume David is going to take this patch?
Correct, I noticed that usually platform specific changes are done in
separate patches even if they are small. Dave already Acked this patch.
So, I do not think it should be separated
Hi Michal,
I hope I haven't missed anything but it looks good to me.
Acked-by: Michal Hocko
Thank you for your review.
one nit below
---
arch/x86/mm/init_64.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
in
Hi Mark,
I considered using a new *populate() function for shadow without using
vmemmap_populate(), but that makes things unnecessarily complicated:
vmemmap_populate() has builtin:
1. large page support
2. device memory support
3. node locality support
4. several config based variants on diffe
It will be best if we can support TSC sync capability in x86, but it seems
it is not easy.
Sure, your hardware achieving sync would be best, but even if it does
not, we can still use TSC. Using notsc simple because you fail to sync
TSCs is quite crazy.
The thing is, we need to support unsync'ed TSC i
. IMO, the latter is better, but
either way works for me.
Thank you,
Pasha
On 09/27/2017 09:52 AM, Dou Liyang wrote:
Hi Pasha, Peter
At 09/27/2017 09:16 PM, Pasha Tatashin wrote:
Hi Peter,
I am totally happy with removing notsc. This certainly simplifies the
sched_clock code. Are there any issues
Hi Russell,
This might be so for ARM, and in fact if you look at my SPARC
implementation, I simply made source clock initialize early, so regular
sched_clock() is used. As on SPARC, we use either %tick or %stick
registers with frequency determined via OpenFrimware. But, on x86 there
are dozen
Hi Peter,
I am totally happy with removing notsc. This certainly simplifies the
sched_clock code. Are there any issues with removing existing kernel
parameters that I should be aware of?
Thank you,
Pasha
On 09/27/2017 09:10 AM, Peter Zijlstra wrote:
On Wed, Sep 27, 2017 at 02:58:57PM +0200,
Hi Fenghua,
Thank you for looking at this. Unfortunately I can't mark either of them
__init because sched_clock_early() is called from
u64 sched_clock_cpu(int cpu)
Which is around for the life of the system.
Thank you,
Pasha
On 08/30/2017 05:21 PM, Fenghua Yu wrote:
On Wed, Aug 30,
Hi Dave,
Thank you for acking.
The reason I am not doing initializing stores is because they require a
membar, even if only regular stores are following (I hoped to do a
membar before first load). This is something I was thinking was not
true, but after consulting with colleagues and checking
void __init timekeeping_init(void)
{
	/*
	 * We must determine boot timestamp before getting current
	 * persistent clock value, because implementation of
	 * read_boot_clock64() might also call the persistent
	 * clock, and a leap second may occur.
And because it's called only once, it does not need to be marked __init()
and must be kept around forever, right?
This is because every other architecture implements read_boot_clock64()
without __init: arm, s390. Besides, the original weak stub does not have the __init
macro. So, I can certainly try t
Hi Thomas,
Thank you for your comments. My replies below.
+/*
+ * Called once during to boot to initialize boot time.
+ */
+void read_boot_clock64(struct timespec64 *ts)
And because it's called only once, it does not need to be marked __init()
and must be kept around forever, right?
This is
Hi Michal,
While working on a bug that was reported to me by "kernel test robot".
unable to handle kernel NULL pointer dereference at (null)
The issue was that page_to_pfn() on that configuration was looking for a
section inside flags fields in "struct page". So, reserved but
unava
to use
this iterator, which will simplify it.
Pasha
On 08/14/2017 09:51 AM, Pasha Tatashin wrote:
mem_init()
 free_all_bootmem()
  free_low_memory_core_early()
   for_each_reserved_mem_region()
    reserve_bootmem_region()
     init_reserved_page() <- if this is deferr
Hi Dou,
Thank you for your comments:
{
	x86_init.timers.timer_init();
	tsc_init();
+	tsc_early_fini();
tsc_early_fini() is defined in patch 2; I guess you may have missed it
when you split your patches.
Indeed, I will move it to patch 2.
+static DEFINE_STATIC_KEY_TRUE(__use_sched_clo
However, now thinking about it, I will change it to CONFIG_MEMBLOCK_DEBUG,
and let users decide what other debugging configs need to be enabled, as
this is also OK.
Actually the more I think about it the more I am convinced that a kernel
boot parameter would be better because it doesn't need the
mem_init()
 free_all_bootmem()
  free_low_memory_core_early()
   for_each_reserved_mem_region()
    reserve_bootmem_region()
     init_reserved_page() <- if this is deferred reserved page
      __init_single_pfn()
       __init_single_page()
So, currently, we are using the value of page->f
#ifdef CONFIG_MEMBLOCK in page_alloc, or define memblock_discard() stubs in
the nobootmem header file.
This is the standard way to do this. And it is usually preferred to
avoid proliferating ifdefs in the code.
Hi Michal,
As you suggested, I sent out this patch separately. If you feel
strongly that this s
OK, I will post it separately. No, it does not depend on the rest, but the
rest depends on this. So, I am not sure how to enforce that this comes
before the rest.
Andrew will take care of that. Just make it explicit that some of the
patch depends on an earlier work when reposting.
Ok.
Yes, t
On 08/14/2017 07:43 AM, Michal Hocko wrote:
register_page_bootmem_info
 register_page_bootmem_info_node
  get_page_bootmem
   .. setting fields here ..
   such as: page->freelist = (void *)type;

free_all_bootmem()
 free_low_memory_core_early()
  for_each_reserved_mem_region()
   reserve
Correct, the pgflags asserts were triggered when we were setting reserved
flags in the struct page for PFN 0, which was never initialized through
__init_single_page(). The reason they were triggered is because we set all
uninitialized memory to ones in one of the debug patches.
And why don't we ne
Hi Michal,
This suggestion won't work, because there are arches without memblock
support: tile, sh...
So, I would still need to have:
#ifdef CONFIG_MEMBLOCK in page_alloc, or define memblock_discard() stubs
in the nobootmem header file. In either case it would become messier than what
it is right
Sure, I could do this, but as I understood from earlier Dave Miller's
comments, we should do one logical change at a time. Hence, introduce the API in
one patch and use it in another. So, this is how I tried to organize this patch
set. Is this assumption incorrect?
Well, it really depends. If the patch
I will address your comment, and send out a new patch. Should I send it out
separately from the series or should I keep it inside?
I would post it separatelly. It doesn't depend on the rest.
OK, I will post it separately. No, it does not depend on the rest, but
the rest depends on this. So, I
When CONFIG_DEBUG_VM is enabled, this patch sets all the memory that is
returned by memblock_virt_alloc_try_nid_raw() to ones to ensure that no
places expect zeroed memory.
Please fold this into the patch which introduces
memblock_virt_alloc_try_nid_raw.
OK
I am not sure CONFIG_DEBUG_VM is
Clients can call alloc_large_system_hash() with flag: HASH_ZERO to specify
that memory that was allocated for system hash needs to be zeroed,
otherwise the memory does not need to be zeroed, and the client will initialize
it.
If memory does not need to be zero'd, call the new
memblock_virt_alloc_raw(
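An illustrative caller (taken loosely from fs/dcache.c; the exact arguments vary): a table that needs zeroed buckets passes HASH_ZERO, everything else receives raw memory and initializes it itself.

    dentry_hashtable =
        alloc_large_system_hash("Dentry cache",
                                sizeof(struct hlist_bl_head),
                                dhash_entries,
                                13,
                                HASH_EARLY | HASH_ZERO,  /* zero the buckets for us */
                                &d_hash_shift,
                                NULL,
                                0, 0);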
On 08/11/2017 09:04 AM, Michal Hocko wrote:
On Mon 07-08-17 16:38:47, Pavel Tatashin wrote:
Replace allocators in sparse-vmemmap to use the non-zeroing version. So,
we will get the performance improvement by zeroing the memory in parallel
when struct pages are zeroed.
First of all this should
Add an optimized mm_zero_struct_page(), so struct pages are zeroed without
calling memset(). We do eight to ten regular stores based on the size of
struct page. The compiler optimizes out the conditions of the switch() statement.
Again, this doesn't explain why we need this. You have mentioned those
r
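A sketch of the optimization described in the quoted commit message (close to what later landed, abbreviated here): the switch collapses at compile time because sizeof(struct page) is a constant, leaving only plain word stores.

    static inline void mm_zero_struct_page(struct page *page)
    {
        unsigned long *p = (unsigned long *)page;

        BUILD_BUG_ON(sizeof(struct page) & 7);  /* whole words only */

        switch (sizeof(struct page)) {
        case 80: p[9] = 0;      /* fall through */
        case 72: p[8] = 0;      /* fall through */
        case 64: p[7] = 0;      /* fall through */
        case 56: p[6] = 0;
                 p[5] = 0; p[4] = 0; p[3] = 0;
                 p[2] = 0; p[1] = 0; p[0] = 0;
        }
    }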
I believe this deserves much more detailed explanation why this is safe.
What actually prevents any pfn walker from seeing an uninitialized
struct page? Please make your assumptions explicit in the commit log so
that we can check them independently.
There is nothing that prevents pfn walkers from wal
On 08/11/2017 08:39 AM, Michal Hocko wrote:
On Mon 07-08-17 16:38:41, Pavel Tatashin wrote:
A new variant of memblock_virt_alloc_* allocations:
memblock_virt_alloc_try_nid_raw()
- Does not zero the allocated memory
- Does not panic if request cannot be satisfied
OK, this looks good b
On 08/11/2017 05:37 AM, Michal Hocko wrote:
On Mon 07-08-17 16:38:39, Pavel Tatashin wrote:
In deferred_init_memmap() where all deferred struct pages are initialized
we have a check like this:
	if (page->flags) {
		VM_BUG_ON(page_zone(page) != zone);
		goto free_range;
I guess this goes all the way down to
Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel
with kswapd")
I will add this to the patch.
Signed-off-by: Pavel Tatashin
Reviewed-by: Steven Sistare
Reviewed-by: Daniel Jordan
Reviewed-by: Bob Picco
Considering that
AFAIU register_page_bootmem_info_node is only about struct pages backing
pgdat, usemap and memmap. Those should be in reserved memblocks and we
do not initialize those at later times, they are not relevant to the
deferred initialization as your changelog suggests so the ordering with
get_page_boot
Struct pages are initialized by going through __init_single_page(). Since
the existing physical memory in memblock is represented in memblock.memory
list, struct page for every page from this list goes through
__init_single_page().
By a page _from_ this list you mean struct pages backing the phy
On 08/11/2017 03:58 AM, Michal Hocko wrote:
[I am sorry I didn't get to your previous versions]
Thank you for reviewing this work. I will address your comments, and
send-out a new patches.
In this work we do the following:
- Never read access struct page until it was initialized
How is t
On 2017-08-08 09:15, David Laight wrote:
From: Pasha Tatashin
Sent: 08 August 2017 12:49
Thank you for looking at this change. What you described was in my
previous iterations of this project.
See for example here: https://lkml.org/lkml/2017/5/5/369
I was asked to remove that flag, and only