On 10/07/18 18:03, Dave Martin wrote:
On Tue, Jul 10, 2018 at 05:38:39PM +0100, Suzuki K Poulose wrote:
On 09/07/18 14:37, Dave Martin wrote:
On Mon, Jul 09, 2018 at 01:29:42PM +0100, Marc Zyngier wrote:
On 09/07/18 12:23, Dave Martin wrote:

[...]

Wedging arguments into a few bits in the type argument feels awkward,
and may be regretted later if we run out of bits, or something can't be
represented in the chosen encoding.

I think that's a pretty convincing argument for a "better" CREATE_VM,
one that would have a clearly defined, structured (and potentially
extensible) argument.

I've quickly hacked the following:

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6270a3b38e9..3e76214034c2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -735,6 +735,20 @@ struct kvm_ppc_resize_hpt {
        __u32 pad;
  };

+struct kvm_create_vm2 {
+       __u64   version;        /* Or maybe not */
+       union {
+               struct {
+#define KVM_ARM_SVE_CAPABLE    (1 << 0)
+#define KVM_ARM_SELECT_IPA     {1 << 1)
+                       __u64   capabilities;
+                       __u16   sve_vlen;
+                       __u8    ipa_size;
+               } arm64;
+               __u64   dummy[15];
+       };
+};
+
  #define KVMIO 0xAE

  /* machine type bits, to be used as argument to KVM_CREATE_VM */

Other architectures could fill in their own bits if they need to.

Thoughts?

This kind of thing should work, but it may still get messy when we
add additional fields.


Marc, Dave,

I like Dave's approach. Some comments below.


It we want this to work cross-arch, would it make sense to go
for a more generic approach, say

struct kvm_create_vm_attr_any {
         __u32   type;
};

#define KVM_CREATE_VM_ATTR_ARCH_CAPABILITIES 1
struct kvm_create_vm_attr_arch_capabilities {
         __u32   type;
         __u16   size; /* support future expansion of capabilities[] */
         __u16   reserved;
         __u64   capabilities[1];
};

We also need to advertise which attributes are supported by the host,
so that the user can tune the available ones. That would make a bit mask
like the above trickier, unless we return the supported values back
in the argument ptr for the "probe" call. And this scheme in general
can be useful for passing back a non-boolean result specific to the
attribute, without having a per-attribute ioctl. (e.g, maximum limit
for IPA).

Maybe, but this could quickly become bloated.  (My approach already
feels a bit bloated...)

I'm not sure that arbitrarily complex negotiation will really be
needed, but userspace might want to change its mind if setting a
particular propertiy fails.

An alternative might be to have a bunch of per-VM ioctls to configure
different things, like x86 has.  There's at least precedent for that.
For arm, we currently only have a few.  That allows for easy extension,
at the cost of adding ioctls.

As you know, one of the major problems with the per-VM ioctls is
the ordering of different operations and tracking to make sure that
the userspace follows the expected order. e.g, the first approach for
IPA series was based on this and it made things complex enough to drop
it.


There may be some ioctls we can reuse, like KVM_ENABLE_CAP for per-
vm capability flags.

May be we could switch to KVM_VM_CAPS and pass a list of capabilities
to be enabled at creation time ? The kvm_enable_cap can pass in additional
arguments for each cap. That way we don't have to rely on a new set of
attributes and probing becomes straight forward.

Suzuki

Reply via email to