From: Sonia Sharma <so...@linux.microsoft.com> Sent: Friday, December 20, 2024 3:56 PM >
Please include the "linux-hyperv@vger.kernel.org" mailing list when submitting patches related to Hyper-V. > Harden hv_util type device drivers to allow single > instance of the device be configured at given time. > I think the reason for this patch needs more explanation. For several VMBus devices, a well-behaved Hyper-V is expected to offer only one instance of the device in a given VM. Linux guests originally assumed that the Hyper-V host is well-behaved, so the device drivers for many of these devices were written assuming only a single instance. But with the introduction of Confidential Computing (CoCo) VMs, Hyper-V is no longer assumed to be well-behaved. If a compromised & malicious Hyper-V were to offer multiple instances of such a device, the device driver assumption about a single instance would be false, and memory corruption could occur, which has the potential to lead to compromise of the CoCo VM. The intent is to prevent such a scenario. Note that this problem extends beyond just "util" devices. Hyper-V is expected to offer only a single instance of keyboard, mouse, frame buffer, and balloon devices as well. So this patch should be extended to include them as well (and your new function names containing "hv_util" should be adjusted). Interestingly, the Hyper-V keyboard driver does not assume a single instance, so it should be safe regardless. But the mouse, frame buffer, and balloon drivers are not safe. With this understanding, there are two ways to approach the problem: 1) Enforce the expectation that a well-behaved Hyper-V only offers a single instance of these VMBus devices. That's the approach that this patch takes. 2) Update the device drivers to remove the assumption of a single instance. With this approach, if a compromised & malicious Hyper-V were to offer multiple instances, the extra devices might be bogus, but memory corruption would not occur and the integrity of the CoCo VM should not be compromised. As mentioned above, such is already the case with the keyboard driver. I've thought about the tradeoffs for the two approaches, and don't really have a strong opinion either way. In some sense, #2 is the more correct approach as ideally device drivers shouldn't make single instance assumptions. But #1 is an easier fix, and perhaps more robust. Other reviewers might have other reasons to prefer one over the other, and have a stronger viewpoint on the tradeoffs. I would be interested in any such comments. But I'm OK with approach #1 unless someone points out a good reason to prefer #2. > > New function vmbus_is_valid_hvutil_offer() is added. > It checks if the new offer is for hv_util device type, > then read the refcount for that device and accept or > reject the offer accordingly. > > Signed-off-by: Sonia Sharma <sonia.sha...@linux.microsoft.com> > --- > drivers/hv/channel_mgmt.c | 64 ++++++++++++++++++++++++++++++++++++++- > 1 file changed, 63 insertions(+), 1 deletion(-) > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > index 3c6011a48dab..1a135cfad81f 100644 > --- a/drivers/hv/channel_mgmt.c > +++ b/drivers/hv/channel_mgmt.c > @@ -20,6 +20,7 @@ > #include <linux/delay.h> > #include <linux/cpu.h> > #include <linux/hyperv.h> > +#include <linux/refcount.h> > #include <asm/mshyperv.h> > #include <linux/sched/isolation.h> > > @@ -156,6 +157,8 @@ const struct vmbus_device vmbus_devs[] = { > }; > EXPORT_SYMBOL_GPL(vmbus_devs); > > +static refcount_t singleton_vmbus_devs[HV_UNKNOWN + 1]; > + >From what I can see, these refcounts always have a value of either 0 or 1. The refcount never goes above 1 because the intent of this patch is to enforce having only a single instance. The patch reads the refcount, sets it to 1, or decrements it. I think you could just use an array of booleans here, and set them to true or false. READ_ONCE() and WRITE_ONCE() should be used because the booleans are accessed from multiple threads. > static const struct { > guid_t guid; > } vmbus_unsupported_devs[] = { > @@ -198,6 +201,25 @@ static bool is_unsupported_vmbus_devs(const guid_t *guid) > return false; > } > > +static bool is_dev_hv_util(u16 dev_type) > +{ > + switch (dev_type) { > + case HV_SHUTDOWN: > + fallthrough; > + case HV_TS: > + fallthrough; > + case HV_HB: > + fallthrough; > + case HV_KVP: > + fallthrough; > + case HV_BACKUP: > + return true; > + > + default: > + return false; > + } > +} > + Rather than have a big case statement here, we already have the "vmbus_devs" array that statically specifies various properties of each VMBus device type. I'd suggest adding a field to those array entries to indicate whether the device type is expected to be a singleton. Then you can just do a direct lookup, like with the "perf_device" and "allowed_in_isolated" properties. > static u16 hv_get_dev_type(const struct vmbus_channel *channel) > { > const guid_t *guid = &channel->offermsg.offer.if_type; > @@ -1004,6 +1026,26 @@ find_primary_channel_by_offer(const struct > vmbus_channel_offer_channel *offer) > return channel; > } > > +static u16 vmbus_is_valid_hvutil_offer(const struct > vmbus_channel_offer_channel *offer) > +{ > + const guid_t *guid = &offer->offer.if_type; > + u16 i; > + > + if (is_hvsock_offer(offer)) > + return HV_UNKNOWN; > + > + for (i = HV_IDE; i < HV_UNKNOWN; i++) { > + if (guid_equal(guid, &vmbus_devs[i].guid) && is_dev_hv_util(i)) > { Ideally, we should avoid coding yet another case of searching through the vmbus_devs[] array for a matching GUID. The function hv_get_dev_type() already does this, and returns the index into the vmbus_devs[] array. You could probably use that function, and then just pass the index as the argument to this function. That index is also stored as the "device_id" (arguably mis-named) in the struct vmbus_channel, so it's already available in the rescind path. > + if (refcount_read(&singleton_vmbus_devs[i])) > + return HV_UNKNOWN; > + refcount_set(&singleton_vmbus_devs[i], 1); > + return i; > + } > + } > + > + return i; > +} > + > static bool vmbus_is_valid_offer(const struct vmbus_channel_offer_channel > *offer) > { > const guid_t *guid = &offer->offer.if_type; > @@ -1031,6 +1073,7 @@ static void vmbus_onoffer(struct > vmbus_channel_message_header *hdr) > struct vmbus_channel_offer_channel *offer; > struct vmbus_channel *oldchannel, *newchannel; > size_t offer_sz; > + u16 dev_type; > > offer = (struct vmbus_channel_offer_channel *)hdr; > > @@ -1115,11 +1158,29 @@ static void vmbus_onoffer(struct > vmbus_channel_message_header *hdr) > return; > } > > + /* > + * If vmbus_is_valid_offer() returns - > + * HV_UNKNOWN - Subsequent offer received for hv_util dev, thus > reject offer. > + * HV_SHUTDOWN|HV_TS|HV_KVP|HV_HB|HV-KVP|HV_BACKUP - Increment > refcount > + * Others - Continue as is without doing anything. > + * > + * NOTE - We do not want to increase refcount if we resume from > hibernation. > + */ > + dev_type = vmbus_is_valid_hvutil_offer(offer); > + if (dev_type == HV_UNKNOWN) { > + pr_err_ratelimited("Invalid hv_util offer %d from the host > supporting " > + "isolation\n", offer->child_relid); This check for multiple instances of a singleton device is not limited to just CoCo VMs (a.k.a. "isolated VMs"). So the error message here really shouldn't reference "host supporting isolation". > + atomic_dec(&vmbus_connection.offer_in_progress); > + return; > + } > + > /* Allocate the channel object and save this offer. */ > newchannel = alloc_channel(); > if (!newchannel) { > vmbus_release_relid(offer->child_relid); > atomic_dec(&vmbus_connection.offer_in_progress); > + if (is_dev_hv_util(dev_type)) > + refcount_dec(&singleton_vmbus_devs[dev_type]); It might be good to have a function that combines the above two lines. Then the two parallel functions are: 1) vmbus_is_valid_hvutil_offer() which marks a singleton device as "already present" [and that function probably needs a new name] 2) vmbus_clear_singleton_device(), [or something similar] that clears the boolean if it is a singleton device. vmbus_clear_singleton_device() would also be used in the rescind path and in the vmbus_add_channel_work() error path that I mention below. > pr_err("Unable to allocate channel object\n"); > return; > } There's another error case in the channel offer path that needs to be handled. vmbus_add_channel_work() can fail, in which case the new channel is cleaned up and removed. The accounting of singleton devices must be updated if the channel is deleted via this error path. > @@ -1235,7 +1296,6 @@ static void vmbus_onoffer_rescind(struct > vmbus_channel_message_header *hdr) > /* > * At this point, the rescind handling can proceed safely. > */ > - This is a spurious whitespace change that should be avoided. > if (channel->device_obj) { > if (channel->chn_rescind_callback) { > channel->chn_rescind_callback(channel); > @@ -1251,6 +1311,8 @@ static void vmbus_onoffer_rescind(struct > vmbus_channel_message_header *hdr) > */ > dev = get_device(&channel->device_obj->device); > if (dev) { > + if (is_dev_hv_util(hv_get_dev_type(channel))) As noted above, the "dev_type" is already stored in the channel structure as field "device_id" (which is a bit mis-named). Michael > + > refcount_dec(&singleton_vmbus_devs[hv_get_dev_type(channel)]); > vmbus_device_unregister(channel->device_obj); > put_device(dev); > }