Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-14 Thread Avi Kivity
On 02/10/2012 07:16 PM, Marcelo Tosatti wrote: > On Thu, Feb 09, 2012 at 04:25:36PM +0200, Avi Kivity wrote: > > On 02/08/2012 08:45 PM, Marcelo Tosatti wrote: > > > > BTW do we really need fast slot creation/destruction? > > > > > > At the moment yes. Boot a RHEL/Fedora installation disk (or any o

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-14 Thread Avi Kivity
On 02/10/2012 03:25 PM, Takuya Yoshikawa wrote:
> Avi Kivity wrote:
> > > 2. When we create (and shift?) a memory slot, we call kvm_arch_flush_shadow()
> > > to clear all mmio sptes, again not restricted to that slot.
> > >
> > > /*
> > >  * If the new memory slot is created, we need t

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-10 Thread Marcelo Tosatti
On Fri, Feb 10, 2012 at 10:08:12PM +0900, Takuya Yoshikawa wrote: > Avi Kivity wrote: > > > On 02/09/2012 04:23 PM, Avi Kivity wrote: > > > > BTW do we really need fast slot creation/destruction? > > > > > > Not really, but it's good to have infrastructure that copes with > > > different workload

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-10 Thread Marcelo Tosatti
On Thu, Feb 09, 2012 at 04:25:36PM +0200, Avi Kivity wrote: > On 02/08/2012 08:45 PM, Marcelo Tosatti wrote: > > > BTW do we really need fast slot creation/destruction? > > > > At the moment yes. Boot a RHEL/Fedora installation disk (or any other > > guest which uses SYSLINUX splash screen) and you

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-10 Thread Takuya Yoshikawa
Avi Kivity wrote:
> > 2. When we create (and shift?) a memory slot, we call kvm_arch_flush_shadow()
> > to clear all mmio sptes, again not restricted to that slot.
> >
> > /*
> >  * If the new memory slot is created, we need to clear all
> >  * mmio sptes.
> >  */
> > if (npage

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-10 Thread Takuya Yoshikawa
Avi Kivity wrote: > On 02/09/2012 04:23 PM, Avi Kivity wrote: > > > BTW do we really need fast slot creation/destruction? > > > > Not really, but it's good to have infrastructure that copes with > > different workloads. If the patches keep the code simple I think it's a > > good thing to have.

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-09 Thread Avi Kivity
On 02/08/2012 08:45 PM, Marcelo Tosatti wrote: > > BTW do we really need fast slot creation/destruction? > > At the moment yes. Boot a RHEL/Fedora installation disk (or any other > guest which uses SYSLINUX splash screen) and you will see. Another workload that suffers is Windows XP clearing the s

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-09 Thread Avi Kivity
On 02/09/2012 04:23 PM, Avi Kivity wrote: > > BTW do we really need fast slot creation/destruction? > > Not really, but it's good to have infrastructure that copes with > different workloads. If the patches keep the code simple I think it's a > good thing to have. To qualify - taking several tens

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-09 Thread Avi Kivity
On 02/08/2012 05:43 PM, Takuya Yoshikawa wrote: > [Dropped non-kvm members from cc] > > Marcelo Tosatti wrote: > > > VGABIOS mode constantly destroys and creates 0xa slot, so > > performance is required for KVM_SET_MEM too (it can probably be fixed in > > qemu, but older qemu's must be support

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-09 Thread Takuya Yoshikawa
On Wed, 8 Feb 2012 16:45:31 -0200 Marcelo Tosatti wrote: > > For 3: I think doing both "write protection" and "shadow flush" is > > unnecessary. > > If you enable dirty logging on a slot, certainly you have to write > protect? When we enable dirty logging, yes. > > > BTW do we really need f

Re: [RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-08 Thread Marcelo Tosatti
On Thu, Feb 09, 2012 at 12:43:20AM +0900, Takuya Yoshikawa wrote: > [Dropped non-kvm members from cc] > > Marcelo Tosatti wrote: > > > VGABIOS mode constantly destroys and creates 0xa slot, so > > performance is required for KVM_SET_MEM too (it can probably be fixed in > > qemu, but older qe

[RFC] need to improve slot creation/destruction? -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-08 Thread Takuya Yoshikawa
[Dropped non-kvm members from cc] Marcelo Tosatti wrote: > VGABIOS mode constantly destroys and creates 0xa slot, so > performance is required for KVM_SET_MEM too (it can probably be fixed in > qemu, but older qemu's must be supported). Apart from srcu, I see some problems concerning slot c

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Avi Kivity
On 02/02/2012 04:44 PM, Takuya Yoshikawa wrote: > Avi Kivity wrote: > > > > I have one concern about correctness issue though: > > > > > > concurrent rmap write protection may not be safe due to > > > delayed tlb flush ... cannot happen? > > > > What do you mean by concurrent rmap write p

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Takuya Yoshikawa
Avi Kivity wrote: > > I have one concern about correctness issue though: > > > > concurrent rmap write protection may not be safe due to > > delayed tlb flush ... cannot happen? > > What do you mean by concurrent rmap write protection? > Not sure, but other codes like: - mmu_sync_chil

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Avi Kivity
On 02/02/2012 12:40 PM, Takuya Yoshikawa wrote: > > I have one concern about correctness issue though: > > concurrent rmap write protection may not be safe due to > delayed tlb flush ... cannot happen? What do you mean by concurrent rmap write protection?

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Takuya Yoshikawa
(2012/02/02 19:21), Avi Kivity wrote:
>> I used "unsigned int" just because I wanted to use the current
>> atomic_clear_mask() as is.
>>
>> We need to implement atomic_clear_mask_long() or use ...
>
> If we use cmpxchg8b/cmpxchg16b then this won't fit with the
> atomic_*_long family.
OK, I will try. I ha

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Avi Kivity
On 02/02/2012 12:21 PM, Takuya Yoshikawa wrote:
> (2012/02/02 19:10), Avi Kivity wrote:
>>> # of dirty pages: kvm.git (ns), with this patch (ns)
>>> 1: 102,077 ns 10,105 ns
>>> 2: 47,197 ns 9,395 ns

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Takuya Yoshikawa
(2012/02/02 19:10), Avi Kivity wrote:
>> # of dirty pages: kvm.git (ns), with this patch (ns)
>> 1: 102,077 ns 10,105 ns
>> 2: 47,197 ns 9,395 ns
>> 4: 43,563 ns 9,938 ns
>> 8: 41,239 ns 10,618

Re: [test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-02 Thread Avi Kivity
On 02/02/2012 07:46 AM, Takuya Yoshikawa wrote: > Avi Kivity wrote: > > > >> That'll be great, numbers are better than speculation. > > >> > > > > > > > > > Yes, I already have some good numbers to show (and some patches). > > > > Looking forward. > > I made a patch to see if Avi's suggestion of

[test result] dirty logging without srcu update -- Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Takuya Yoshikawa
Avi Kivity wrote: > >> That'll be great, numbers are better than speculation. > >> > > > > > > Yes, I already have some good numbers to show (and some patches). > > Looking forward. I made a patch to see if Avi's suggestion of getting rid of srcu update for dirty logging is practical; tested w

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Takuya Yoshikawa
On Wed, 1 Feb 2012 11:43:47 -0200 Marcelo Tosatti wrote: > > > I can show you some performance numbers, this weekend, if you like. > > > > That'll be great, numbers are better than speculation. > > get dirty log:5634134 ns for 262144 dirty pages > > 5ms (for the entire operation). >

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Paul E. McKenney
On Wed, Feb 01, 2012 at 11:22:29AM +0100, Peter Zijlstra wrote: > On Tue, 2012-01-31 at 14:24 -0800, Paul E. McKenney wrote: > > > > > Can we get it back to speed by scheduling a work function on all cpus? > > > > wouldn't that force a quiescent state and allow call_srcu() to fire? > > > > > > >

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Marcelo Tosatti
On Wed, Feb 01, 2012 at 01:01:38PM +0200, Avi Kivity wrote:
> On 02/01/2012 01:00 PM, Takuya Yoshikawa wrote:
> >> rcu_assign_pointer), and use atomic operations to copy and clear:
> >>
> >>    word = bitmap[i]
> >>    put_user(word)
> >>    atomic_and(&bitmap[i], ~word)
> >>
> > This

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Marcelo Tosatti
On Wed, Feb 01, 2012 at 12:49:57PM +0200, Avi Kivity wrote: > On 02/01/2012 12:44 PM, Avi Kivity wrote: > > On 02/01/2012 12:22 PM, Peter Zijlstra wrote: > > > One of the things I was thinking of is adding a sequence counter in the > > > per-cpu data. Using that we could do something like: > > > >

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Avi Kivity
On 02/01/2012 01:12 PM, Takuya Yoshikawa wrote:
> (2012/02/01 20:01), Avi Kivity wrote:
>> On 02/01/2012 01:00 PM, Takuya Yoshikawa wrote:
>>> How about just doing:
>>>
>>> take a spin_lock
>>> copy the entire (or some portions of) bitmap locally
>>> clear the bitmap
>>> unlock
>>>
>> That mea

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Takuya Yoshikawa
(2012/02/01 20:01), Avi Kivity wrote:
> On 02/01/2012 01:00 PM, Takuya Yoshikawa wrote:
>> How about just doing:
>>
>> take a spin_lock
>> copy the entire (or some portions of) bitmap locally
>> clear the bitmap
>> unlock
>
> That means that vcpus dirtying memory also have to take that lock, and spin while the bi

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Avi Kivity
On 02/01/2012 01:00 PM, Takuya Yoshikawa wrote:
>> rcu_assign_pointer), and use atomic operations to copy and clear:
>>
>>    word = bitmap[i]
>>    put_user(word)
>>    atomic_and(&bitmap[i], ~word)
>>
> This kind of thing was really slow IIRC.
>
> How about just doing:
>
> take a spin_loc

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Takuya Yoshikawa
(2012/02/01 19:49), Avi Kivity wrote:
> On 02/01/2012 12:44 PM, Avi Kivity wrote:
>> On 02/01/2012 12:22 PM, Peter Zijlstra wrote:
>>> One of the things I was thinking of is adding a sequence counter in the
>>> per-cpu data. Using that we could do something like:
>>>
>>>   unsigned int seq1 = 0, seq2 = 0, count =

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Avi Kivity
On 02/01/2012 12:44 PM, Avi Kivity wrote:
> On 02/01/2012 12:22 PM, Peter Zijlstra wrote:
> > One of the things I was thinking of is adding a sequence counter in the
> > per-cpu data. Using that we could do something like:
> >
> >   unsigned int seq1 = 0, seq2 = 0, count = 0;
> >   int cpu, idx;
>

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Avi Kivity
On 02/01/2012 12:22 PM, Peter Zijlstra wrote:
> One of the things I was thinking of is adding a sequence counter in the
> per-cpu data. Using that we could do something like:
>
>   unsigned int seq1 = 0, seq2 = 0, count = 0;
>   int cpu, idx;
>
>   idx = ACCESS_ONCE(sp->completions) & 1;
>
>   for_

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-02-01 Thread Peter Zijlstra
On Tue, 2012-01-31 at 14:24 -0800, Paul E. McKenney wrote: > > > Can we get it back to speed by scheduling a work function on all cpus? > > > wouldn't that force a quiescent state and allow call_srcu() to fire? > > > > > > In kvm's use case synchronize_srcu_expedited() is usually called when no

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-01-31 Thread Paul E. McKenney
On Tue, Jan 31, 2012 at 02:50:07PM +0100, Peter Zijlstra wrote: > On Tue, 2012-01-31 at 15:47 +0200, Avi Kivity wrote: > > > They really need to return quickly to userspace, and they really need to > > perform some operation between rcu_assign_pointer() and returning, so no. > > Bugger :/ > > >

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-01-31 Thread Peter Zijlstra
On Tue, 2012-01-31 at 15:47 +0200, Avi Kivity wrote: > They really need to return quickly to userspace, and they really need to > perform some operation between rcu_assign_pointer() and returning, so no. Bugger :/ > > > > Compile tested only!! :-) > > > > How much did synchronize_srcu_expedited

Re: [RFC][PATCH] srcu: Implement call_srcu()

2012-01-31 Thread Avi Kivity
On 01/31/2012 03:32 PM, Peter Zijlstra wrote: > Subject: srcu: Implement call_srcu() > From: Peter Zijlstra > Date: Mon Jan 30 23:20:49 CET 2012 > > Implement call_srcu() by using a state machine driven by > call_rcu_sched() and timer callbacks. > > The state machine is a direct derivation of the