[ add linux-coco ]

Roman Kisel wrote:
> 
> 
> On 4/9/2025 9:03 AM, Robin Murphy wrote:
> > On 2025-04-09 1:08 am, Roman Kisel wrote:
> >> Bounce-buffering makes the system spend more time copying
> >> I/O data. When the I/O transaction take place between
> >> a confidential and a non-confidential endpoints, there is
> >> no other way around.
> >>
> >> Introduce a device bitfield to indicate that the device
> >> doesn't need to perform bounce buffering. The capable
> >> device may employ it to save on copying data around.
> > 
> > It's not so much about bounce buffering, it's more fundamentally about 
> > whether the device is trusted and able to access private memory at all 
> > or not. And performance is hardly the biggest concern either - if you do 
> > trust a device to operate on confidential data in private memory, then 
> > surely it is crucial to actively *prevent* that data ever getting into 
> > shared SWIOTLB pages where anyone else could also get at it. At worst 
> > that means CoCo VMs might need an *additional* non-shared SWIOTLB to 
> > support trusted devices with addressing limitations (and/or 
> > "swiotlb=force" debugging, potentially).
> 
> Thanks, I should've highlighted that facet most certainly!

One would hope that no one is building a modern device with trusted I/O
capability, *and* with a swiotlb addressing dependency. However, I agree
that a non-shared swiotlb would be needed in such a scenario.

Otherwise the policy around "a device should not even be allowed to
bounce buffer any private page" is a userspace responsibility to either
not load the driver, not release secrets to this CVM, or otherwise make
sure the device is only ever bounce buffering private memory that does
not contain secrets.

> > Also whatever we do for this really wants to tie in with the nascent 
> > TDISP stuff as well, since we definitely don't want to end up with more 
> > than one notion of whether a device is in a trusted/locked/private/etc. 
> > vs. unlocked/shared/etc. state with respect to DMA (or indeed anything 
> > else if we can avoid it).
> 
> Wouldn't TDISP be per-device as well? In which case, a flag would be
> needed just as being added in this patch.
> 
> Although, there must be a difference between a device with TDISP where
> the flag would be the indication of the feature, and this code where the
> driver may flip that back and forth...
> 
> Do you feel this is shoehorned in `struct device`? I couldn't find an
> appropriate private (== opaque pointer) part in the structure to store
> that bit (`struct device_private` wouldn't fit the bill) and looked like
> adding it to the struct itself would do no harm. However, my read of the
> room is that folks see that as dubious :)
> 
> What would be your opinion on where to store that flag to tie together
> its usage in the Hyper-V SCSI and not bounce-buffering?

The name and location of a flag bit is not the issue, it is the common
expectation of how and when that flag is set.

tl;dr Linux likely needs a "private_accepted" flag for devices

Like Christoph said, a driver really has no business opting itself into
different DMA addressing domains. For TDISP we are also being careful to
make sure that flipping a device from shared to private is a suitably
violent event. This is because the Linux DMA layer does not have a
concept of allowing a device to have mappings from two different
addressing domains simultaneously.

In the current TDISP proposal, a device starts in shared mode and only
after validating all of the launch state of the CVM, device
measurements, and a device interface report is it granted access to
private memory. Without dumping a bunch of golden measurement data into
the kernel that validation can really only be performed by userspace.

Enter this vmbus proposal that wants to emulate devices with a paravisor
that is presumably within the TCB at launch, but the kernel can not
really trust that until a "launch state of the CVM + paravisor"
attestation event.

Like PCIe TDISP the capability of this device to access private memory
is a property of the bus and the iommu. However, acceptance of the
device into private operation is a willful policy action. It needs to
validate not only the device provenance and state, but also the Linux
DMA layer requirements of not holding shared or swiotlb mappings over
the "entry into private mode operation" event.

All that said, I would advocate to require a userspace driven "device
accept" event for all devices, not just TDISP, that want to enter
private operation. Maybe later circle back to figure out if there is a
lingering need for accepting devices via golden measurement, or other
means, to skip the userpace round-trip dependency.

A "private_capable" flag might also make sense, but that is really a
property of a bus that need not be carried necessarily in 'struct
device'.

So for this confidential vmbus SCSI device to mesh with the mechanisms
needed for TDISP I would expect it continues to launch in swiotlb mode
by default. Export an attribute via hv_bus->dev_groups to indicate that
the device is "private_capable" and then require userspace to twiddle a
private_accepted flag with some safety for in-flight DMA.

Reply via email to