Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-02-26 Thread Andy Lutomirski
On Thu, Jan 8, 2015 at 2:43 PM, Andy Lutomirski wrote: > On Thu, Jan 8, 2015 at 2:31 PM, Marcelo Tosatti wrote: >> On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote: >>> On Tue, Jan 6, 2015 at 10:45 AM, Marcelo Tosatti >>> wrote: >>> > On Tue, Jan 06, 2015 at 10:26:22AM -0800, And

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-08 Thread Andy Lutomirski
On Thu, Jan 8, 2015 at 2:31 PM, Marcelo Tosatti wrote: > On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote: >> On Tue, Jan 6, 2015 at 10:45 AM, Marcelo Tosatti wrote: >> > On Tue, Jan 06, 2015 at 10:26:22AM -0800, Andy Lutomirski wrote: >> >> On Tue, Jan 6, 2015 at 10:13 AM, Marcelo

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-08 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote: > On Tue, Jan 6, 2015 at 10:45 AM, Marcelo Tosatti wrote: > > On Tue, Jan 06, 2015 at 10:26:22AM -0800, Andy Lutomirski wrote: > >> On Tue, Jan 6, 2015 at 10:13 AM, Marcelo Tosatti > >> wrote: > >> > On Tue, Jan 06, 2015 at 08:56:4

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-08 Thread David Vrabel
On 23/12/2014 00:39, Andy Lutomirski wrote: The pvclock vdso code was too abstracted to understand easily and excessively paranoid. Simplify it for a huge speedup. This opens the door for additional simplifications, as the vdso no longer accesses the pvti for any vcpu other than vcpu 0. Before

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-07 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 11:18:21PM -0800, Andy Lutomirski wrote: > On Tue, Jan 6, 2015 at 9:38 PM, Paolo Bonzini wrote: > > > > > > On 06/01/2015 17:56, Andy Lutomirski wrote: > >> Still no good. We can migrate a bunch of times so we see the same CPU > >> all three times > > > > There are no thre

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-07 Thread Paolo Bonzini
On 07/01/2015 08:18, Andy Lutomirski wrote: >>> >> Thus far, I've been told unambiguously that a guest can't observe pvti >>> >> while it's being written, and I think you're now telling me that this >>> >> isn't true and that a guest *can* observe pvti while it's being >>> >> written while the lo

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Andy Lutomirski
On Tue, Jan 6, 2015 at 9:38 PM, Paolo Bonzini wrote: > > > On 06/01/2015 17:56, Andy Lutomirski wrote: >> Still no good. We can migrate a bunch of times so we see the same CPU >> all three times > > There are no three times. The CPU you see here: > >>> >>> >>> // ... compute nanoseconds from

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Paolo Bonzini
On 06/01/2015 19:26, Andy Lutomirski wrote: > Don't you still need: > > version++; > write the rest; > version++; > > with possible smp_wmb() in there to keep the compiler from messing around? No, see my other reply. Separating the version write is a real bug, but that should be all that it's

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Paolo Bonzini
On 06/01/2015 17:56, Andy Lutomirski wrote: > Still no good. We can migrate a bunch of times so we see the same CPU > all three times There are no three times. The CPU you see here: >> >> >> // ... compute nanoseconds from pvti and tsc ... >> rmb(); >> } while(v != pvti->version);

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Andy Lutomirski
On Tue, Jan 6, 2015 at 12:20 PM, Marcelo Tosatti wrote: > On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote: >> > What is the point with the new flags bit though? >> >> To try to work around the problem on old hosts. I'm not at all >> convinced that this is worthwhile or that it hel

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote: > > What is the point with the new flags bit though? > > To try to work around the problem on old hosts. I'm not at all > convinced that this is worthwhile or that it helps, though. Don't think so. Just fix the host bug. > >> Also

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Andy Lutomirski
On Tue, Jan 6, 2015 at 10:45 AM, Marcelo Tosatti wrote: > On Tue, Jan 06, 2015 at 10:26:22AM -0800, Andy Lutomirski wrote: >> On Tue, Jan 6, 2015 at 10:13 AM, Marcelo Tosatti wrote: >> > On Tue, Jan 06, 2015 at 08:56:40AM -0800, Andy Lutomirski wrote: >> >> On Jan 6, 2015 4:01 AM, "Paolo Bonzini"

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 10:26:22AM -0800, Andy Lutomirski wrote: > On Tue, Jan 6, 2015 at 10:13 AM, Marcelo Tosatti wrote: > > On Tue, Jan 06, 2015 at 08:56:40AM -0800, Andy Lutomirski wrote: > >> On Jan 6, 2015 4:01 AM, "Paolo Bonzini" wrote: > >> > > >> > > >> > > >> > On 06/01/2015 09:42, Paol

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Andy Lutomirski
On Tue, Jan 6, 2015 at 10:13 AM, Marcelo Tosatti wrote: > On Tue, Jan 06, 2015 at 08:56:40AM -0800, Andy Lutomirski wrote: >> On Jan 6, 2015 4:01 AM, "Paolo Bonzini" wrote: >> > >> > >> > >> > On 06/01/2015 09:42, Paolo Bonzini wrote: >> > > > > Still confused. So we can freeze all vCPUs in the

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Marcelo Tosatti
On Tue, Jan 06, 2015 at 08:56:40AM -0800, Andy Lutomirski wrote: > On Jan 6, 2015 4:01 AM, "Paolo Bonzini" wrote: > > > > > > > > On 06/01/2015 09:42, Paolo Bonzini wrote: > > > > > Still confused. So we can freeze all vCPUs in the host, then update > > > > > pvti 1, then resume vCPU 1, then upda

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Andy Lutomirski
On Jan 6, 2015 4:01 AM, "Paolo Bonzini" wrote: > > > > On 06/01/2015 09:42, Paolo Bonzini wrote: > > > > Still confused. So we can freeze all vCPUs in the host, then update > > > > pvti 1, then resume vCPU 1, then update pvti 0? In that case, we have > > > > a problem, because vCPU 1 can observe

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Konrad Rzeszutek Wilk
On Mon, Jan 05, 2015 at 10:56:07AM -0800, Andy Lutomirski wrote: > On Mon, Jan 5, 2015 at 7:25 AM, Marcelo Tosatti wrote: > > On Mon, Dec 22, 2014 at 04:39:57PM -0800, Andy Lutomirski wrote: > >> The pvclock vdso code was too abstracted to understand easily and > >> excessively paranoid. Simplify

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Paolo Bonzini
On 06/01/2015 09:42, Paolo Bonzini wrote: > > > Still confused. So we can freeze all vCPUs in the host, then update > > > pvti 1, then resume vCPU 1, then update pvti 0? In that case, we have > > > a problem, because vCPU 1 can observe pvti 0 mid-update, and KVM > > > doesn't increment the vers

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Paolo Bonzini
On 05/01/2015 23:48, Marcelo Tosatti wrote: >>> > > But there is no guarantee that vCPU-N has updated its pvti when >>> > > vCPU-M resumes guest instruction execution. >> > >> > Still confused. So we can freeze all vCPUs in the host, then update >> > pvti 1, then resume vCPU 1, then update pvti

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-06 Thread Paolo Bonzini
On 05/01/2015 20:17, Marcelo Tosatti wrote: > But there is no guarantee that vCPU-N has updated its pvti when > vCPU-M resumes guest instruction execution. You're right. > So the cost this patch removes is mainly from __getcpu (==RDTSCP?) ? > Perhaps you can use Gleb's idea to stick vcpu id int

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Andy Lutomirski
On Mon, Jan 5, 2015 at 2:48 PM, Marcelo Tosatti wrote: > On Mon, Jan 05, 2015 at 02:38:46PM -0800, Andy Lutomirski wrote: >> On Mon, Jan 5, 2015 at 11:17 AM, Marcelo Tosatti wrote: >> > On Mon, Jan 05, 2015 at 10:56:07AM -0800, Andy Lutomirski wrote: >> >> On Mon, Jan 5, 2015 at 7:25 AM, Marcelo

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Marcelo Tosatti
On Mon, Jan 05, 2015 at 02:38:46PM -0800, Andy Lutomirski wrote: > On Mon, Jan 5, 2015 at 11:17 AM, Marcelo Tosatti wrote: > > On Mon, Jan 05, 2015 at 10:56:07AM -0800, Andy Lutomirski wrote: > >> On Mon, Jan 5, 2015 at 7:25 AM, Marcelo Tosatti > >> wrote: > >> > On Mon, Dec 22, 2014 at 04:39:57

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Andy Lutomirski
On Mon, Jan 5, 2015 at 11:17 AM, Marcelo Tosatti wrote: > On Mon, Jan 05, 2015 at 10:56:07AM -0800, Andy Lutomirski wrote: >> On Mon, Jan 5, 2015 at 7:25 AM, Marcelo Tosatti wrote: >> > On Mon, Dec 22, 2014 at 04:39:57PM -0800, Andy Lutomirski wrote: >> >> The pvclock vdso code was too abstracted

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Marcelo Tosatti
On Mon, Jan 05, 2015 at 10:56:07AM -0800, Andy Lutomirski wrote: > On Mon, Jan 5, 2015 at 7:25 AM, Marcelo Tosatti wrote: > > On Mon, Dec 22, 2014 at 04:39:57PM -0800, Andy Lutomirski wrote: > >> The pvclock vdso code was too abstracted to understand easily and > >> excessively paranoid. Simplify

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Paolo Bonzini
On 05/01/2015 19:56, Andy Lutomirski wrote: >> > 1) State: all pvtis marked as PVCLOCK_TSC_STABLE_BIT. >> > 1) Update request for all vcpus, for a TSC_STABLE_BIT -> ~TSC_STABLE_BIT >> > transition. >> > 2) vCPU-1 updates its pvti with new values. >> > 3) vCPU-0 still has not updated its pvti with

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Andy Lutomirski
On Mon, Jan 5, 2015 at 7:25 AM, Marcelo Tosatti wrote: > On Mon, Dec 22, 2014 at 04:39:57PM -0800, Andy Lutomirski wrote: >> The pvclock vdso code was too abstracted to understand easily and >> excessively paranoid. Simplify it for a huge speedup. >> >> This opens the door for additional simplifi

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-01-05 Thread Marcelo Tosatti
On Mon, Dec 22, 2014 at 04:39:57PM -0800, Andy Lutomirski wrote: > The pvclock vdso code was too abstracted to understand easily and > excessively paranoid. Simplify it for a huge speedup. > > This opens the door for additional simplifications, as the vdso no > longer accesses the pvti for any vc

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-24 Thread Andy Lutomirski
On Wed, Dec 24, 2014 at 1:30 PM, David Matlack wrote: > On Mon, Dec 22, 2014 at 4:39 PM, Andy Lutomirski wrote: >> The pvclock vdso code was too abstracted to understand easily and >> excessively paranoid. Simplify it for a huge speedup. >> >> This opens the door for additional simplifications,

Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-24 Thread David Matlack
On Mon, Dec 22, 2014 at 4:39 PM, Andy Lutomirski wrote: > The pvclock vdso code was too abstracted to understand easily and > excessively paranoid. Simplify it for a huge speedup. > > This opens the door for additional simplifications, as the vdso no > longer accesses the pvti for any vcpu other

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-23 Thread Boris Ostrovsky
On 12/23/2014 10:14 AM, Paolo Bonzini wrote: On 23/12/2014 16:14, Boris Ostrovsky wrote: +do { +version = pvti->version; + +/* This is also a read barrier, so we'll read version first. */ +rdtsc_barrier(); +tsc = __native_read_tsc(); This will cause VMEXIT

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-23 Thread Paolo Bonzini
On 23/12/2014 16:14, Boris Ostrovsky wrote: >> +do { >> +version = pvti->version; >> + >> +/* This is also a read barrier, so we'll read version first. */ >> +rdtsc_barrier(); >> +tsc = __native_read_tsc(); > > > This will cause VMEXIT on Xen with TSC_MODE_AL

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-23 Thread Boris Ostrovsky
On 12/22/2014 07:39 PM, Andy Lutomirski wrote: The pvclock vdso code was too abstracted to understand easily and excessively paranoid. Simplify it for a huge speedup. This opens the door for additional simplifications, as the vdso no longer accesses the pvti for any vcpu other than vcpu 0. Bef

Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-23 Thread David Vrabel
On 23/12/14 00:39, Andy Lutomirski wrote: > The pvclock vdso code was too abstracted to understand easily and > excessively paranoid. Simplify it for a huge speedup. > > This opens the door for additional simplifications, as the vdso no > longer accesses the pvti for any vcpu other than vcpu 0. >

[RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2014-12-22 Thread Andy Lutomirski
The pvclock vdso code was too abstracted to understand easily and excessively paranoid. Simplify it for a huge speedup. This opens the door for additional simplifications, as the vdso no longer accesses the pvti for any vcpu other than vcpu 0. Before, vclock_gettime using kvm-clock took about 64