Andrea Arcangeli wrote:

One day we want to sort the slots according to size. We'll need better locking then (rcu, likely).

I think it's more interesting to sort them by their start/end gfn
addresses; a prio_tree would be the optimal structure for me to use in
the mmu notifier invalidates, as I could ask the prio tree to show me,
in O(log(N)) (N being the number of slots), all the slots that overlap
the invalidated start/end range. However I'm afraid there aren't
enough slots for this to matter... but OTOH modifications aren't
frequent either, so it may be a micro-optimization (if there were
frequent modifications with such a small number of slots, it would
likely be slower than a list).

There are only two interesting slots, 1G-end_of_mem and 0-pci_hole. A sorted array gives an average of less than two lookups to find the slot, with the smallest possible constant. Any other algorithm will give more lookups with a far higher constant.

Sometimes linear search is the best algorithm.


@@ -1235,9 +1340,9 @@ static void mmu_free_roots(struct kvm_vcpu *vcpu)
        int i;
        struct kvm_mmu_page *sp;
 
-       if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
-               return;
        spin_lock(&vcpu->kvm->mmu_lock);
+       if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
+               goto out;
vcpu-> stuff is protected by the vcpu mutex.

That was so it could run in ->release, but I'm undoing it since I'm removing ->release.


Surely ->release can't happen while vcpus are still running?

Please move the registration to virt/kvm/kvm_main.c, and provide stubs for non-x86. This is definitely something that we want to do cross-arch (except s390, which has it in hardware).

Sorry, but I originally did this in kvm/kvm_main.c and got complaints
from the s390 folks, who didn't want the common code polluted with
stuff s390 doesn't need. So what should I do?


x86, ia64, and ppc all need mmu notifiers as well as other new architectures, so this better be in common code. We can #ifdef it so s390 doesn't suffer a performance impact.

+
+static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp)
+{
+       u64 *spte;
+       int young = 0;
+
+       spte = rmap_next(kvm, rmapp, NULL);
+       while (spte) {
+               int _young;
+               u64 _spte = *spte;
+               BUG_ON(!(_spte & PT_PRESENT_MASK));
+               _young = _spte & PT_ACCESSED_MASK;

I forgot, last round: EPT doesn't have PT_ACCESSED_MASK, so you're reading (and clearing) some random bit. We can easily qualify the test by looking at shadow_accessed_mask; if zero, the hardware doesn't have a young bit.

There's a bigger question of what to do in the case of EPT. We can always return true, or return true and also unmap. The first means we lie to the vm, the second means we take unnecessary exits to maintain age information.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
