On Tue, Nov 17, 2020 at 04:38:52PM +0100, David Hildenbrand wrote:
> Sorry for the late replay, fairly busy with all kinds of things.

Heh, no worries, I appreciate the time :-)

> > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> > index b02fd51e5589..6b57bf90ca72 100644
> > --- a/drivers/acpi/acpi_memhotplug.c
> > +++ b/drivers/acpi/acpi_memhotplug.c
> > @@ -195,7 +195,7 @@ static int acpi_memory_enable_device(struct 
> > acpi_memory_device *mem_device)
> >                     node = memory_add_physaddr_to_nid(info->start_addr);
> >             result = __add_memory(node, info->start_addr, info->length,
> > -                                 MHP_NONE);
> > +                                 MEMHP_MEMMAP_ON_MEMORY);
> 
> I'd suggest moving that into a separate patch.

Fine by me

> > diff --git a/include/linux/memory.h b/include/linux/memory.h
> > index 439a89e758d8..7cc93de5856c 100644
> > --- a/include/linux/memory.h
> > +++ b/include/linux/memory.h
> > @@ -30,6 +30,7 @@ struct memory_block {
> >     int phys_device;                /* to which fru does this belong? */
> >     struct device dev;
> >     int nid;                        /* NID for this memory block */
> > +   unsigned long nr_vmemmap_pages; /* Number for vmemmap pages */
> 
> Maybe also document that these pages are directly at the beginning of the
> memory block.

Sure

> > -static inline int offline_pages(unsigned long start_pfn, unsigned long 
> > nr_pages)
> > +static inline int offline_pages(unsigned long start_pfn, unsigned long 
> > nr_pages,
> > +                           unsigned long nr_vmemmap_pages)
> >   {
> >     return -EINVAL;
> >   }
> > @@ -369,10 +372,12 @@ extern int add_memory_driver_managed(int nid, u64 
> > start, u64 size,
> >                                  mhp_t mhp_flags);
> >   extern void move_pfn_range_to_zone(struct zone *zone, unsigned long 
> > start_pfn,
> >                                unsigned long nr_pages,
> > +                              unsigned long nr_vmemmap_pages,
> >                                struct vmem_altmap *altmap, int migratetype);
> >   extern void remove_pfn_range_from_zone(struct zone *zone,
> >                                    unsigned long start_pfn,
> > -                                  unsigned long nr_pages);
> > +                                  unsigned long nr_pages,
> > +                                  unsigned long nr_vmemmap_pages);
> 
> I think we should not pass nr_vmemmap_pages down here but instead do two
> separate calls to move_pfn_range_to_zone()/remove_pfn_range_from_zone() from
> online_pages()/offline_pages()
> 
> 1. for vmemmap pages, migratetype = MIGRATE_UNMOVABLE
> 2. for remaining pages, migratetype = MIGRATE_ISOLATE

Ok, that was the other option, it might be even cleaner.

> > +   valid_start_pfn = pfn + nr_vmemmap_pages;
> > +   valid_nr_pages = nr_pages - nr_vmemmap_pages;
> 
> Hm, valid sounds strange. More like "free_start_pfn" or "buddy_start_pfn".

Agreed, I might lean towards buddy_start_pfn.

> > -   move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> > +   move_pfn_range_to_zone(zone, pfn, nr_pages, nr_vmemmap_pages, NULL,
> > +                          MIGRATE_ISOLATE);
> 
> As mentioned, I'd suggest properly initializing the memmap here
> 
> if (nr_vmemmap_pages) {
>       move_pfn_range_to_zone(zone, pfn, nr_vmemmap_pages, NULL,
>                              MIGRATE_UNMOVABLE);
> }
> move_pfn_range_to_zone(zone, valid_start_pfn, valid_nr_pages, NULL,

Sure, agreed.

> > +   if (!support_memmap_on_memory(size))
> > +           mhp_flags &= ~MEMHP_MEMMAP_ON_MEMORY;
> 
> Callers (e.g., virtio-mem) might rely on this. We should reject this with
> -EINVAL and provide a way for callers to test whether this flag is possible.

Uhm, we might want to make "support_memmap_on_memory" public, and
callers who might want to it use can check its return value?
Or do you have something else in mind?

Agreed on the -EINVAIL.

> > +   if (mhp_flags & MEMHP_MEMMAP_ON_MEMORY)
> > +           mhp_mark_vmemmap_pages(params.altmap);
> 
> Do we really still need that? Pages are offline, so we're messing with an
> invalid memmap. online_pages() should handle initializing the memmap of
> these pages.

Yeah, on a second thought we do not need this.
Since the pages are still offline, no one should be messing with that
range yet anyway.

> 
> [...]
> 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index e74ca22baaa1..043503fb8c6e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8761,6 +8761,13 @@ void __offline_isolated_pages(unsigned long 
> > start_pfn, unsigned long end_pfn)
> >     spin_lock_irqsave(&zone->lock, flags);
> >     while (pfn < end_pfn) {
> >             page = pfn_to_page(pfn);
> > +           /*
> > +            * Skip vmemmap pages
> > +            */
> > +           if (PageVmemmap(page)) {
> > +                   pfn += vmemmap_nr_pages(page);
> > +                   continue;
> > +           }
> 
> I'd assume calling code can handle that and exclude isolating such pages.

The thing is that __offline_isolated_pages calls offline_mem_sections(),
so we really need the first pfn, and not the "pfn + nr_vmemmap_pages".
Instead of skipping it in the loop, I might just skip it before entering
the loop.

Thanks!

-- 
Oscar Salvador
SUSE L3

Reply via email to