Re: (subset) [PATCH 0/5] hyper-v: Don't assume cpu_possible_mask is dense
On Wed, 02 Oct 2024 20:53:28 -0700, mhkelle...@gmail.com wrote: > Code specific to Hyper-V guests currently assumes the cpu_possible_mask > is "dense" -- i.e., all bit positions 0 thru (nr_cpu_ids - 1) are set, > with no "holes". Therefore, num_possible_cpus() is assumed to be equal > to nr_cpu_ids. > > Per a separate discussion[1], this assumption is not valid in the > general case. For example, the function setup_nr_cpu_ids() in > kernel/smp.c is coded to assume cpu_possible_mask may be sparse, > and other patches have been made in the past to correctly handle > the sparseness. See bc75e99983df1efd ("rcu: Correctly handle sparse > possible cpu") as noted by Mark Rutland. > > [...] Applied to 6.14/scsi-queue, thanks! [4/5] scsi: storvsc: Don't assume cpu_possible_mask is dense https://git.kernel.org/mkp/scsi/c/6cb7063feb2e -- Martin K. Petersen Oracle Linux Engineering
[PATCH] hv_balloon: Fallback to generic_online_page() for non-HV hot added mem
When device memory blocks are hot-added via add_memory_driver_managed(), the HV balloon driver intercepts them but fails to online these memory blocks. This could result in device driver initialization failures. To address this, fall back to the generic online callback for memory blocks not added by the HV balloon driver. Similar cases are handled the same way in virtio-mem driver. Suggested-by: Vikram Sethi Tested-by: Michael Frohlich Signed-off-by: Jacob Pan --- drivers/hv/hv_balloon.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index a99112e6f0b8..c999daf34108 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -766,16 +766,18 @@ static void hv_online_page(struct page *pg, unsigned int order) struct hv_hotadd_state *has; unsigned long pfn = page_to_pfn(pg); - guard(spinlock_irqsave)(&dm_device.ha_lock); - list_for_each_entry(has, &dm_device.ha_region_list, list) { - /* The page belongs to a different HAS. */ - if (pfn < has->start_pfn || - (pfn + (1UL << order) > has->end_pfn)) - continue; + scoped_guard(spinlock_irqsave, &dm_device.ha_lock) { + list_for_each_entry(has, &dm_device.ha_region_list, list) { + /* The page belongs to a different HAS. */ + if (pfn < has->start_pfn || + (pfn + (1UL << order) > has->end_pfn)) + continue; - hv_bring_pgs_online(has, pfn, 1UL << order); - break; + hv_bring_pgs_online(has, pfn, 1UL << order); + return; + } } + generic_online_page(pg, order); } static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt) -- 2.34.1
[PATCH v2] uio_hv_generic: Add a check for HV_NIC for send, receive buffers setup
Receive and send buffer allocation was originally introduced to support DPDK's networking use case. These buffer sizes were further increased to meet DPDK performance requirements. However, these large buffers are unnecessary for any other UIO use cases. Restrict the allocation of receive and send buffers only for HV_NIC device type, saving 47 MB of memory per device. While at it, fix some of the syntax related issues in the touched code which are reported by "--strict" option of checkpatch. Signed-off-by: Naman Jain --- Changes since v1: https://lore.kernel.org/all/20241125125015.1500-1-namj...@linux.microsoft.com/ Rephrased commit msg as per Saurabh's suggestion. --- drivers/uio/uio_hv_generic.c | 86 ++-- 1 file changed, 43 insertions(+), 43 deletions(-) diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c index 3976360d0096..1b19b5647495 100644 --- a/drivers/uio/uio_hv_generic.c +++ b/drivers/uio/uio_hv_generic.c @@ -296,51 +296,51 @@ hv_uio_probe(struct hv_device *dev, pdata->info.mem[MON_PAGE_MAP].size = PAGE_SIZE; pdata->info.mem[MON_PAGE_MAP].memtype = UIO_MEM_LOGICAL; - pdata->recv_buf = vzalloc(RECV_BUFFER_SIZE); - if (pdata->recv_buf == NULL) { - ret = -ENOMEM; - goto fail_free_ring; + if (channel->device_id == HV_NIC) { + pdata->recv_buf = vzalloc(RECV_BUFFER_SIZE); + if (!pdata->recv_buf) { + ret = -ENOMEM; + goto fail_free_ring; + } + + ret = vmbus_establish_gpadl(channel, pdata->recv_buf, + RECV_BUFFER_SIZE, &pdata->recv_gpadl); + if (ret) { + if (!pdata->recv_gpadl.decrypted) + vfree(pdata->recv_buf); + goto fail_close; + } + + /* put Global Physical Address Label in name */ + snprintf(pdata->recv_name, sizeof(pdata->recv_name), +"recv:%u", pdata->recv_gpadl.gpadl_handle); + pdata->info.mem[RECV_BUF_MAP].name = pdata->recv_name; + pdata->info.mem[RECV_BUF_MAP].addr = (uintptr_t)pdata->recv_buf; + pdata->info.mem[RECV_BUF_MAP].size = RECV_BUFFER_SIZE; + pdata->info.mem[RECV_BUF_MAP].memtype = UIO_MEM_VIRTUAL; + + pdata->send_buf = vzalloc(SEND_BUFFER_SIZE); + if (!pdata->send_buf) { + ret = -ENOMEM; + goto fail_close; + } + + ret = vmbus_establish_gpadl(channel, pdata->send_buf, + SEND_BUFFER_SIZE, &pdata->send_gpadl); + if (ret) { + if (!pdata->send_gpadl.decrypted) + vfree(pdata->send_buf); + goto fail_close; + } + + snprintf(pdata->send_name, sizeof(pdata->send_name), +"send:%u", pdata->send_gpadl.gpadl_handle); + pdata->info.mem[SEND_BUF_MAP].name = pdata->send_name; + pdata->info.mem[SEND_BUF_MAP].addr = (uintptr_t)pdata->send_buf; + pdata->info.mem[SEND_BUF_MAP].size = SEND_BUFFER_SIZE; + pdata->info.mem[SEND_BUF_MAP].memtype = UIO_MEM_VIRTUAL; } - ret = vmbus_establish_gpadl(channel, pdata->recv_buf, - RECV_BUFFER_SIZE, &pdata->recv_gpadl); - if (ret) { - if (!pdata->recv_gpadl.decrypted) - vfree(pdata->recv_buf); - goto fail_close; - } - - /* put Global Physical Address Label in name */ - snprintf(pdata->recv_name, sizeof(pdata->recv_name), -"recv:%u", pdata->recv_gpadl.gpadl_handle); - pdata->info.mem[RECV_BUF_MAP].name = pdata->recv_name; - pdata->info.mem[RECV_BUF_MAP].addr - = (uintptr_t)pdata->recv_buf; - pdata->info.mem[RECV_BUF_MAP].size = RECV_BUFFER_SIZE; - pdata->info.mem[RECV_BUF_MAP].memtype = UIO_MEM_VIRTUAL; - - pdata->send_buf = vzalloc(SEND_BUFFER_SIZE); - if (pdata->send_buf == NULL) { - ret = -ENOMEM; - goto fail_close; - } - - ret = vmbus_establish_gpadl(channel, pdata->send_buf, - SEND_BUFFER_SIZE, &pdata->send_gpadl); - if (ret) { - if (!pdata->send_gpadl.decrypted) - vfree(pdata->send_buf); - goto fail_close; - } - - snprintf(pdata->send_name, sizeof(pdata->send_name), -"send:%u", pdata->send_gpadl.gpadl_handle); - pdata->info.mem[SEND_BUF_MAP].name = pdata->send_name; - pdata->info.mem[SEND_BUF_MAP].addr - = (uintptr_t)pdata->send_buf; - pdata->info.mem[SEND_BUF_MAP].size = SEND_BUFFER_SIZE;
[PATCH v5 2/2] Drivers: hv: vmbus: Log on missing offers if any
From: John Starks When resuming from hibernation, log any channels that were present before hibernation but now are gone. In general, the boot-time devices configured for a resuming VM should be the same as the devices in the VM at the time of hibernation. It's uncommon for the configuration to have been changed such that offers are missing. Changing the configuration violates the rules for hibernation anyway. The cleanup of missing channels is not straight-forward and dependent on individual device driver functionality and implementation, so it can be added in future with separate changes. Signed-off-by: John Starks Co-developed-by: Naman Jain Signed-off-by: Naman Jain Reviewed-by: Easwar Hariharan Reviewed-by: Saurabh Sengar Reviewed-by: Michael Kelley --- Changes since v4: Rebased on latest tip and added Michael's Reviewed-by tag. Changes since v3: Fixed minor checkpatch style warning coming with "--strict" option, addressing Saurabh's comments. Changes since v2: * Addressed Michael's comments: * Changed commit msg as per suggestions * Addressed Dexuan's comments, which came up in offline discussion: * Minor additions in commit subject. Changes since v1: * Added Easwar's Reviewed-By tag * Addressed Saurabh's comments: * Added a note for missing channel cleanup in comments and commit msg --- drivers/hv/vmbus_drv.c | 17 + 1 file changed, 17 insertions(+) diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index bf5608a74056..0f6cd44fff29 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -2462,6 +2462,7 @@ static int vmbus_bus_suspend(struct device *dev) static int vmbus_bus_resume(struct device *dev) { + struct vmbus_channel *channel; struct vmbus_channel_msginfo *msginfo; size_t msgsize; int ret; @@ -2494,6 +2495,22 @@ static int vmbus_bus_resume(struct device *dev) vmbus_request_offers(); + mutex_lock(&vmbus_connection.channel_mutex); + list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) { + if (channel->offermsg.child_relid != INVALID_RELID) + continue; + + /* hvsock channels are not expected to be present. */ + if (is_hvsock_channel(channel)) + continue; + + pr_err("channel %pUl/%pUl not present after resume.\n", + &channel->offermsg.offer.if_type, + &channel->offermsg.offer.if_instance); + /* ToDo: Cleanup these channels here */ + } + mutex_unlock(&vmbus_connection.channel_mutex); + /* Reset the event for the next suspend. */ reinit_completion(&vmbus_connection.ready_for_suspend_event); -- 2.43.0
[PATCH v5 1/2] Drivers: hv: vmbus: Wait for boot-time offers during boot and resume
Channel offers are requested during VMBus initialization and resume from hibernation. Add support to wait for all boot-time channel offers to be delivered and processed before returning from vmbus_request_offers. This is in analogy to a PCI bus not returning from probe until it has scanned all devices on the bus. Without this, user mode can race with VMBus initialization and miss channel offers. User mode has no way to work around this other than sleeping for a while, since there is no way to know when VMBus has finished processing boot-time offers. With this added functionality, remove earlier logic which keeps track of count of offered channels post resume from hibernation. Once all offers delivered message is received, no further boot-time offers are going to be received. Consequently, logic to prevent suspend from happening after previous resume had missing offers, is also removed. Co-developed-by: John Starks Signed-off-by: John Starks Signed-off-by: Naman Jain Reviewed-by: Easwar Hariharan Reviewed-by: Saurabh Sengar Reviewed-by: Michael Kelley --- Changes since v4: Rebased and added Michael's Reviewed-by tag. Changes since v3: Fixed checkpatch style warnings coming with "--strict" option, addressing Saurabh's comments. FYI, I kept code style same as earlier for below, to keep consistency with other similar fields in the code and because of lack of options due to 100 char limit. *** CHECK: Lines should not end with a '(' FILE: drivers/hv/connection.c:37: + .all_offers_delivered_event = COMPLETION_INITIALIZER( *** Changes since v2: * Incorporated Easwar's suggestion to use secs_to_jiffies() as his changes are now merged. * Addressed Michael's comments: * Used boot-time offers/channels/devices to maintain consistency * Rephrased CHANNELMSG_ALLOFFERS_DELIVERED handler function comments for better explanation. Thanks for sharing the write-up. * Changed commit msg and other things as per suggestions * Addressed Dexuan's comments, which came up in offline discussion: * Changed timeout for waiting for all offers delivered msg to 60s instead of 10s. Reason being, the host can experience some servicing events or diagnostics events, which may take a long time and hence may fail to offer all the devices within 10s. * Minor additions in commit subject. Changes since v1: * Added Easwar's Reviewed-By tag * Addressed Michael's comments: * Added explanation of all offers delivered message in comments * Removed infinite wait for offers logic, and changed it wait once. * Removed sub channel workqueue flush logic * Added comments on why MLX device offer is not expected as part of this essential boot offer list. I refrained from adding too many details on it as it felt like it is beyond the scope of this patch series and may not be relevant to this. However, please let me know if something needs to be added. * Addressed Saurabh's comments: * Changed timeout value to 1 ms instead of 10*1 * Changed commit msg as per suggestions * Added a comment for warning case of wait_for_completion timeout --- drivers/hv/channel_mgmt.c | 61 +-- drivers/hv/connection.c | 4 +-- drivers/hv/hyperv_vmbus.h | 14 ++--- drivers/hv/vmbus_drv.c| 16 -- 4 files changed, 51 insertions(+), 44 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 3c6011a48dab..6e084c207414 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -944,16 +944,6 @@ void vmbus_initiate_unload(bool crash) vmbus_wait_for_unload(); } -static void check_ready_for_resume_event(void) -{ - /* -* If all the old primary channels have been fixed up, then it's safe -* to resume. -*/ - if (atomic_dec_and_test(&vmbus_connection.nr_chan_fixup_on_resume)) - complete(&vmbus_connection.ready_for_resume_event); -} - static void vmbus_setup_channel_state(struct vmbus_channel *channel, struct vmbus_channel_offer_channel *offer) { @@ -1109,8 +1099,6 @@ static void vmbus_onoffer(struct vmbus_channel_message_header *hdr) /* Add the channel back to the array of channels. */ vmbus_channel_map_relid(oldchannel); - check_ready_for_resume_event(); - mutex_unlock(&vmbus_connection.channel_mutex); return; } @@ -1296,13 +1284,28 @@ EXPORT_SYMBOL_GPL(vmbus_hvsock_device_unregister); /* * vmbus_onoffers_delivered - - * This is invoked when all offers have been delivered. + * The CHANNELMSG_ALLOFFERS_DELIVERED message arrives after all + * boot-time offers are delivered. A boot-time offer is for the primary + * channel for any virtual hardware configured in the VM at the time it boots. + * Boot-time offers include offers for physical devices assigned to the VM + * via Hyper-V's Discrete Device Assignment (DDA) fun
[PATCH v5 0/2] Drivers: hv: vmbus: Wait for boot-time offers and log missing offers if any
After VM requests for channel offers during boot or resume from hibernation, host offers the devices that the VM is configured with and then sends a separate message indicating that all the boot-time channel offers are delivered. Wait for this message to make this boot-time offers request and receipt process synchronous. Without this, user mode can race with VMBus initialization and miss channel offers. User mode has no way to work around this other than sleeping for a while, since there is no way to know when VMBus has finished processing boot-time offers. This is in analogy to a PCI bus not returning from probe until it has scanned all devices on the bus. As part of this implementation, some code cleanup is also done for the logic which becomes redundant due to this change. Second patch prints the channels which are not offered when resume happens from hibernation to supply more information to the end user. Changes since v4: https://lore.kernel.org/all/20241118070725.3221-1-namj...@linux.microsoft.com/ Rebased on latest tip and added Michael's Reviewed-by tag. Changes since v3: https://lore.kernel.org/all/20241113084700.2940-1-namj...@linux.microsoft.com/ Fixed checkpatch style warnings coming with "--strict" option, addressing Saurabh's comments. FYI, I kept code style same as earlier for below, to keep consistency with other similar fields in the code and because of lack of options due to 100 char limit. *** CHECK: Lines should not end with a '(' FILE: drivers/hv/connection.c:37: + .all_offers_delivered_event = COMPLETION_INITIALIZER( *** Changes since v2: https://lore.kernel.org/all/20241029080147.52749-1-namj...@linux.microsoft.com/ * Incorporated Easwar's suggestion to use secs_to_jiffies() as his changes are now merged. * Addressed Michael's comments: * Used boot-time offers/channels/devices to maintain consistency * Rephrased CHANNELMSG_ALLOFFERS_DELIVERED handler function comments for better explanation. Thanks for sharing the write-up. * Changed commit msg and other things as per suggestions * Addressed Dexuan's comments, which came up in offline discussion: * Changed timeout for waiting for all offers delivered msg to 60s instead of 10s. Reason being, the host can experience some servicing events or diagnostics events, which may take a long time and hence may fail to offer all the devices within 10s. * Minor additions in commit subject of both patches * Rebased on latest linux-next master tip Changes since v1: https://lore.kernel.org/all/20241018115811.5530-1-namj...@linux.microsoft.com/ * Added Easwar's Reviewed-By tag * Addressed Michael's comments: * Added explanation of all offers delivered message in comments * Removed infinite wait for offers logic, and changed it wait once. * Removed sub channel workqueue flush logic * Added comments on why MLX device offer is not expected as part of this essential boot offer list. I refrained from adding too many details on it as it felt like it is beyond the scope of this patch series and may not be relevant to this. However, please let me know if something needs to be added. * Addressed Saurabh's comments: * Changed timeout value to 1 ms instead of 10*1000 * Changed commit msg as per suggestions * Added a comment for warning case of wait_for_completion timeout * Added a note for missing channel cleanup in comments and commit msg John Starks (1): Drivers: hv: vmbus: Log on missing offers if any Naman Jain (1): Drivers: hv: vmbus: Wait for boot-time offers during boot and resume drivers/hv/channel_mgmt.c | 61 +-- drivers/hv/connection.c | 4 +-- drivers/hv/hyperv_vmbus.h | 14 ++--- drivers/hv/vmbus_drv.c| 31 ++-- 4 files changed, 67 insertions(+), 43 deletions(-) base-commit: 8155b4ef3466f0e289e8fcc9e6e62f3f4dceeac2 -- 2.43.0