On Wed, Jul 11, 2018 at 10:05:50AM +0100, Suzuki K Poulose wrote: > On 10/07/18 18:03, Dave Martin wrote: > >On Tue, Jul 10, 2018 at 05:38:39PM +0100, Suzuki K Poulose wrote: > >>On 09/07/18 14:37, Dave Martin wrote: > >>>On Mon, Jul 09, 2018 at 01:29:42PM +0100, Marc Zyngier wrote: > >>>>On 09/07/18 12:23, Dave Martin wrote: > > > >[...] > > > >>>>>Wedging arguments into a few bits in the type argument feels awkward, > >>>>>and may be regretted later if we run out of bits, or something can't be > >>>>>represented in the chosen encoding. > >>>> > >>>>I think that's a pretty convincing argument for a "better" CREATE_VM, > >>>>one that would have a clearly defined, structured (and potentially > >>>>extensible) argument. > >>>> > >>>>I've quickly hacked the following: > >>>> > >>>>diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > >>>>index b6270a3b38e9..3e76214034c2 100644 > >>>>--- a/include/uapi/linux/kvm.h > >>>>+++ b/include/uapi/linux/kvm.h > >>>>@@ -735,6 +735,20 @@ struct kvm_ppc_resize_hpt { > >>>> __u32 pad; > >>>> }; > >>>> > >>>>+struct kvm_create_vm2 { > >>>>+ __u64 version; /* Or maybe not */ > >>>>+ union { > >>>>+ struct { > >>>>+#define KVM_ARM_SVE_CAPABLE (1 << 0) > >>>>+#define KVM_ARM_SELECT_IPA {1 << 1) > >>>>+ __u64 capabilities; > >>>>+ __u16 sve_vlen; > >>>>+ __u8 ipa_size; > >>>>+ } arm64; > >>>>+ __u64 dummy[15]; > >>>>+ }; > >>>>+}; > >>>>+ > >>>> #define KVMIO 0xAE > >>>> > >>>> /* machine type bits, to be used as argument to KVM_CREATE_VM */ > >>>> > >>>>Other architectures could fill in their own bits if they need to. > >>>> > >>>>Thoughts? > >>> > >>>This kind of thing should work, but it may still get messy when we > >>>add additional fields. > >> > >> > >>Marc, Dave, > >> > >>I like Dave's approach. Some comments below. > >> > >>> > >>>It we want this to work cross-arch, would it make sense to go > >>>for a more generic approach, say > >>> > >>>struct kvm_create_vm_attr_any { > >>> __u32 type; > >>>}; > >>> > >>>#define KVM_CREATE_VM_ATTR_ARCH_CAPABILITIES 1 > >>>struct kvm_create_vm_attr_arch_capabilities { > >>> __u32 type; > >>> __u16 size; /* support future expansion of capabilities[] */ > >>> __u16 reserved; > >>> __u64 capabilities[1]; > >>>}; > >> > >>We also need to advertise which attributes are supported by the host, > >>so that the user can tune the available ones. That would make a bit mask > >>like the above trickier, unless we return the supported values back > >>in the argument ptr for the "probe" call. And this scheme in general > >>can be useful for passing back a non-boolean result specific to the > >>attribute, without having a per-attribute ioctl. (e.g, maximum limit > >>for IPA). > > > >Maybe, but this could quickly become bloated. (My approach already > >feels a bit bloated...) > > > >I'm not sure that arbitrarily complex negotiation will really be > >needed, but userspace might want to change its mind if setting a > >particular propertiy fails. > > > >An alternative might be to have a bunch of per-VM ioctls to configure > >different things, like x86 has. There's at least precedent for that. > >For arm, we currently only have a few. That allows for easy extension, > >at the cost of adding ioctls. > > As you know, one of the major problems with the per-VM ioctls is > the ordering of different operations and tracking to make sure that > the userspace follows the expected order. e.g, the first approach for > IPA series was based on this and it made things complex enough to drop > it.
I'm aware of that, but if we are adding a new KVM_CREATE_VM, we could perhaps give it different semantics: i.e., we create a half-created VM that only accepts configuration ioctls and a "finish creation" ioctl that finalises everything before you're allowed to create devices, vcpus etc. This is the sort of thing I was moving torwards for SVE (but for vcpus there). I'm not saying we should drop the existing KVM_CREATE_VM2 ideas, but that we should take a step back if it starts to accrue complexity. > > > >There may be some ioctls we can reuse, like KVM_ENABLE_CAP for per- > >vm capability flags. > > May be we could switch to KVM_VM_CAPS and pass a list of capabilities > to be enabled at creation time ? The kvm_enable_cap can pass in additional > arguments for each cap. That way we don't have to rely on a new set of > attributes and probing becomes straight forward. That's a possibility. I guess we'd need to understand how exactly x86 uses this. Cheers ---Dave