The Hyper-V balloon driver installs a custom callback for handling page
onlining operations performed by the memory hotplug subsystem. This
custom callback is global, and overrides the default callback
(generic_online_page) that Linux otherwise uses. The custom callback
properly handles memory that is hot-added by the balloon driver as part
of a Hyper-V hot-add region.

But memory can also be hot-added directly by a device driver for a vPCI
device, particularly a GPU. In that case, the custom callback installed by
the balloon driver still runs, but it does not find the page in its hot-add
region list and so never onlines it, which can cause driver initialization
to fail.

Fix this by having the balloon custom callback call generic_online_page()
when the page isn't part of a Hyper-V hot-add region, thereby falling back
to the default Linux behavior. This allows device driver hot-adds to work
properly. The virtio-mem driver handles similar cases the same way.
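
As a rough user-space illustration of the lookup-then-fallback decision
described above (not the kernel code itself; struct region, enum
online_path and pick_online_path() are hypothetical names invented for
this sketch, and locking is omitted), the logic can be modeled as:

```c
#include <stddef.h>

/* Hypothetical stand-in for one hot-add region: PFN range [start_pfn, end_pfn). */
struct region {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Which path would online the block: the balloon-specific one or the default. */
enum online_path {
	ONLINE_BALLOON,
	ONLINE_GENERIC,
};

/*
 * Decide how a block of (1 << order) pages starting at pfn is onlined.
 * The block takes the balloon path only if it lies entirely inside one
 * of the known regions; otherwise it falls through to the generic path.
 */
static enum online_path pick_online_path(const struct region *regions,
					 size_t nr_regions,
					 unsigned long pfn,
					 unsigned int order)
{
	unsigned long nr_pages = 1UL << order;
	size_t i;

	for (i = 0; i < nr_regions; i++) {
		/* The block belongs to a different region (or to none). */
		if (pfn < regions[i].start_pfn ||
		    pfn + nr_pages > regions[i].end_pfn)
			continue;

		return ONLINE_BALLOON;
	}

	/* No region matched: use the default onlining behavior. */
	return ONLINE_GENERIC;
}
```

With regions [100, 200) and [300, 400), a block at PFN 196 with order 2
takes the balloon path, while a block at PFN 250 matches no region and
falls through to the generic path.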

Suggested-by: Vikram Sethi <vse...@nvidia.com>
Tested-by: Michael Frohlich <mfrohl...@microsoft.com>
Reviewed-by: Michael Kelley <mhkli...@outlook.com>
Signed-off-by: Jacob Pan <jacob....@linux.microsoft.com>
---
v2: Updated commit message suggested by Michael Kelley.
---
 drivers/hv/hv_balloon.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index a99112e6f0b8..c999daf34108 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -766,16 +766,18 @@ static void hv_online_page(struct page *pg, unsigned int order)
        struct hv_hotadd_state *has;
        unsigned long pfn = page_to_pfn(pg);
 
-       guard(spinlock_irqsave)(&dm_device.ha_lock);
-       list_for_each_entry(has, &dm_device.ha_region_list, list) {
-               /* The page belongs to a different HAS. */
-               if (pfn < has->start_pfn ||
-                   (pfn + (1UL << order) > has->end_pfn))
-                       continue;
+       scoped_guard(spinlock_irqsave, &dm_device.ha_lock) {
+               list_for_each_entry(has, &dm_device.ha_region_list, list) {
+                       /* The page belongs to a different HAS. */
+                       if (pfn < has->start_pfn ||
+                               (pfn + (1UL << order) > has->end_pfn))
+                               continue;
 
-               hv_bring_pgs_online(has, pfn, 1UL << order);
-               break;
+                       hv_bring_pgs_online(has, pfn, 1UL << order);
+                       return;
+               }
        }
+       generic_online_page(pg, order);
 }
 
 static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
-- 
2.34.1