Re: mmu_notifiers: turn off lockdep around mm_take_all_locks

2009-07-07 Thread Peter Zijlstra
On Tue, 2009-07-07 at 15:06 -0300, Marcelo Tosatti wrote: > KVM guests with CONFIG_LOCKDEP=y trigger the following warning: > > BUG: MAX_LOCK_DEPTH too low! > turning off the locking correctness validator. > Pid: 4624, comm: qemu-system-x86 Not tainted 2.6.31-rc2-03981-g3abaf21 > #32 > Call Trace:

Re: mmu_notifiers: turn off lockdep around mm_take_all_locks

2009-07-07 Thread Peter Zijlstra
On Tue, 2009-07-07 at 15:37 -0300, Marcelo Tosatti wrote: > >>> > >>> Is there any way around this other than completly shutting down lockdep? > >>> > >> > >> When we created this the promise was that kvm would only do this on a > >> fresh mm with only a few vmas, has that changed > > > > The

Re: mmu_notifiers: turn off lockdep around mm_take_all_locks

2009-07-07 Thread Peter Zijlstra
On Tue, 2009-07-07 at 12:25 -0700, Linus Torvalds wrote: > > On Tue, 7 Jul 2009, Peter Zijlstra wrote: > > > > Another issue, at about >=256 vmas we'll overflow the preempt count. So > > disabling lockdep will only 'fix' this for a short while, until

Re: [PATCH 6/7] KVM-GST: adjust scheduler cpu power

2011-06-20 Thread Peter Zijlstra
On Tue, 2011-06-14 at 22:26 -0300, Glauber Costa wrote: > On 06/14/2011 07:42 AM, Peter Zijlstra wrote: > > On Mon, 2011-06-13 at 19:31 -0400, Glauber Costa wrote: > >> @@ -1981,12 +1987,29 @@ static void update_rq_clock_task(struct rq > >> *rq, s64 delta) > >

Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-29 Thread Peter Zijlstra
On Wed, 2011-06-29 at 10:52 +0300, Avi Kivity wrote: > On 06/13/2011 04:34 PM, Avi Kivity wrote: > > This patchset exposes an emulated version 1 architectural performance > > monitoring unit to KVM guests. The PMU is emulated using perf_events, > > so the host kernel can multiplex host-wide, host-

Re: [PATCH 0/5] perf support for amd guest/host-only bits v2

2011-06-29 Thread Peter Zijlstra
On Tue, 2011-06-28 at 18:10 +0200, Joerg Roedel wrote: > On Fri, Jun 17, 2011 at 03:37:29PM +0200, Joerg Roedel wrote: > > this is the second version of the patch-set to support the AMD > > guest-/host only bits in the performance counter MSRs. Due to lack of > > time I havn't looked into emulating

Re: [PATCH v3 8/9] KVM-GST: adjust scheduler cpu power

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > + return __touch_steal_time(is_idle, UINT_MAX, NULL); That wants to be ULLONG_MAX, because max_steal is a u64, with UINT_MAX the comparison: + if (steal > max_steal) Isn't true per-se and the compiler cannot optimize t
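The point about UINT_MAX versus ULLONG_MAX can be shown with a small userspace sketch. The helper name exceeds_cap() is hypothetical and only stands in for the quoted `steal > max_steal` check. With a 64-bit cap of UINT_MAX the comparison can still be true, so the compiler has to keep it. With ULLONG_MAX no u64 value can exceed the cap, so the test is always false and can be dropped.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the quoted "if (steal > max_steal)" check.
 * Both operands are 64-bit, as max_steal is a u64 in the patch.
 */
static bool exceeds_cap(uint64_t steal, uint64_t max_steal)
{
    return steal > max_steal;
}

/* Cap of UINT_MAX: a 64-bit steal value can still be larger. */
static bool capped_at_uint_max(uint64_t steal)
{
    return exceeds_cap(steal, UINT_MAX);
}

/* Cap of ULLONG_MAX: nothing representable in 64 bits is larger. */
static bool capped_at_ullong_max(uint64_t steal)
{
    return exceeds_cap(steal, ULLONG_MAX);
}
```

For example, capped_at_uint_max((uint64_t)UINT_MAX + 1) is true, while capped_at_ullong_max() is false for every input.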

Re: [PATCH v3 8/9] KVM-GST: adjust scheduler cpu power

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > +#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING > + if (static_branch((&paravirt_steal_rq_enabled))) { > + int is_idle; > + u64 st; > + > + is_idle = ((rq->curr != rq->idle) || > +

Re: [PATCH v3 8/9] KVM-GST: adjust scheduler cpu power

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > @@ -2063,12 +2092,7 @@ static int irqtime_account_si_update(void) > > #define sched_clock_irqtime(0) > > -static void update_rq_clock_task(struct rq *rq, s64 delta) > -{ > - rq->clock_task += delta; > -} > - > -#endif /* CONF

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > This patch accounts steal time time in kernel/sched. > I kept it from last proposal, because I still see advantages > in it: Doing it here will give us easier access from scheduler > variables such as the cpu rq. The next patch shows an exam

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > + if (static_branch(&paravirt_steal_enabled)) { How is that going to compile on !CONFIG_PARAVIRT or !x86 in general? Only x86-PARAVIRT will provide that variable. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > +static inline u64 steal_ticks(u64 steal) > +{ > + if (unlikely(steal > NSEC_PER_SEC)) > + return steal / TICK_NSEC; That won't compile on a number of 32bit architecture, use div_u64 or something similar. > + > +
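A rough userspace sketch of the div_u64 suggestion follows. On 32-bit targets a plain `/` with a 64-bit dividend makes the compiler emit a call to a library helper that the kernel does not link, which is why the kernel provides div_u64(u64 dividend, u32 divisor). Everything named *_sketch below is a hypothetical stand-in so the example builds outside the kernel, and the tick length assumes HZ=1000, which is not stated in the thread.

```c
#include <stdint.h>

/* Assumption for illustration only: HZ=1000, so one tick is 1,000,000 ns. */
#define TICK_NSEC_SKETCH 1000000U

/*
 * Userspace stand-in with the same shape as the kernel's div_u64():
 * 64-bit dividend, 32-bit divisor. In the kernel the real helper avoids
 * the libgcc 64-bit division call on 32-bit architectures.
 */
static uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor)
{
    return dividend / divisor;
}

/* Convert a steal time in nanoseconds to whole ticks via the helper. */
static uint64_t steal_ns_to_ticks_sketch(uint64_t steal_ns)
{
    return div_u64_sketch(steal_ns, TICK_NSEC_SKETCH);
}
```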

Re: [PATCH v3 3/9] KVM-HDR: KVM Steal time implementation

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > + version: guest has to check version before and after grabbing > + time information and check that they are both equal and even. > + An odd version indicates an in-progress update. That's generall
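The version rule quoted above (read the version before and after, require both reads to be equal and even) can be sketched as follows. This is a minimal userspace illustration using C11 atomics, not the KVM implementation. The struct layout and the names steal_area_sketch, publish_steal_sketch and read_steal_sketch are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared area: a version counter guarding one payload field. */
struct steal_area_sketch {
    _Atomic uint32_t version;
    _Atomic uint64_t steal;
};

/* Writer side: the version is odd while the update is in progress. */
static void publish_steal_sketch(struct steal_area_sketch *a, uint64_t value)
{
    atomic_fetch_add(&a->version, 1);   /* now odd: update in progress */
    atomic_store(&a->steal, value);
    atomic_fetch_add(&a->version, 1);   /* now even: update complete */
}

/* Reader side: retry until both version reads match and are even. */
static uint64_t read_steal_sketch(struct steal_area_sketch *a)
{
    uint32_t before, after;
    uint64_t value;

    do {
        before = atomic_load(&a->version);
        value = atomic_load(&a->steal);
        after = atomic_load(&a->version);
    } while ((before & 1) || before != after);

    return value;
}
```

The reader never takes a lock. It only pays for a retry when it races with a writer.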

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-06-30 Thread Peter Zijlstra
On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > +static noinline bool touch_steal_time(int is_idle) That noinline is very unlucky there, > +{ > + u64 steal, st = 0; > + > + if (static_branch(&paravirt_steal_enabled)) { > + > + steal = paravirt_steal_clock(smp_proce

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-07-01 Thread Peter Zijlstra
On Thu, 2011-06-30 at 23:50 -0300, Glauber Costa wrote: > I was under the impression that the proper use of jump labels required > each label to be tied to a single location. If we make it inline, the > same key would point to multiple locations, and we would have trouble > altering all of the lo

Re: [PATCH v3 7/9] KVM-GST: KVM Steal time accounting

2011-07-01 Thread Peter Zijlstra
On Thu, 2011-06-30 at 23:53 -0300, Glauber Costa wrote: > On 06/30/2011 06:54 PM, Peter Zijlstra wrote: > > On Wed, 2011-06-29 at 11:29 -0400, Glauber Costa wrote: > >> + if (static_branch(&paravirt_steal_enabled)) { > > > > How is that going to compile on !CONF

Re: [PATCH v4 8/9] KVM-GST: adjust scheduler cpu power

2011-07-02 Thread Peter Zijlstra
On Fri, 2011-07-01 at 17:22 -0400, Glauber Costa wrote: > @@ -1971,8 +1974,14 @@ static inline u64 steal_ticks(u64 steal) > > static void update_rq_clock_task(struct rq *rq, s64 delta) > { > - s64 irq_delta; > - > +/* > + * In theory, the compile should just see 0 here, and optimize out t

Re: [PATCH v4 7/9] KVM-GST: KVM Steal time accounting

2011-07-02 Thread Peter Zijlstra
On Fri, 2011-07-01 at 17:22 -0400, Glauber Costa wrote: > @@ -3929,6 +3945,23 @@ void account_process_tick(struct task_struct *p, int > user_tick) > return; > } > > +#ifdef CONFIG_PARAVIRT > + if (static_branch(&paravirt_steal_enabled)) { > + u64 steal, st

Re: [PATCH v4 8/9] KVM-GST: adjust scheduler cpu power

2011-07-02 Thread Peter Zijlstra
d, > prev_steal_time_rq. This is because otherwise, information about time > accounted in update_process_tick() would never reach us in update_rq_clock(). > > Signed-off-by: Glauber Costa > CC: Rik van Riel > CC: Jeremy Fitzhardinge > CC: Peter Zijlstra > CC: Avi Kiv

Re: [PATCH v5 7/9] KVM-GST: KVM Steal time accounting

2011-07-05 Thread Peter Zijlstra
off-by: Glauber Costa > CC: Rik van Riel > CC: Jeremy Fitzhardinge Acked-by: Peter Zijlstra Venki, can you have a look at that irqtime_account_process_tick(), I think adding the steal time up front like this is fine, because it suffers from the same 'problem' as both irqtime thin

Re: [PATCH v5 8/9] KVM-GST: adjust scheduler cpu power

2011-07-05 Thread Peter Zijlstra
d, > prev_steal_time_rq. This is because otherwise, information about time > accounted in update_process_tick() would never reach us in > update_rq_clock(). > > Signed-off-by: Glauber Costa > CC: Rik van Riel > CC: Jeremy Fitzhardinge Acked-by: Peter Zijlstra

Re: [PATCH 1/3] perf: add context field to perf_event

2011-07-12 Thread Peter Zijlstra
On Tue, 2011-07-12 at 10:20 +0300, Avi Kivity wrote: > Maybe we need a generic "run this function in this task's context" > mechanism instead. Like an IPI, but targeting tasks instead of cpus. > kernel/event/core.c:task_function_call() like?

Re: [PATCH 1/3] perf: add context field to perf_event

2011-07-12 Thread Peter Zijlstra
On Tue, 2011-07-12 at 12:08 +0300, Avi Kivity wrote: > Similar, but with stronger guarantees: when the function is called, > current == p, and the task was either sleeping or in userspace. If the task is sleeping, current can never be p.

Re: [PATCH 1/3] perf: add context field to perf_event

2011-07-12 Thread Peter Zijlstra
On Tue, 2011-07-12 at 12:27 +0300, Avi Kivity wrote: > On 07/12/2011 12:18 PM, Peter Zijlstra wrote: > > > > > > The guarantee is that the task was sleeping just before the function is > > > called. Of course it's woken up to run the function. > > >

Re: [PATCH 1/3] perf: add context field to perf_event

2011-07-12 Thread Peter Zijlstra
On Tue, 2011-07-12 at 12:16 +0300, Avi Kivity wrote: > On 07/12/2011 12:14 PM, Peter Zijlstra wrote: > > On Tue, 2011-07-12 at 12:08 +0300, Avi Kivity wrote: > > > Similar, but with stronger guarantees: when the function is called, > > > current == p, and the tas

Re: perf uncore & lkvm woes

2012-08-16 Thread Peter Zijlstra
On Thu, 2012-08-16 at 10:01 +0300, Pekka Enberg wrote: > Has anyone seen this? It's kvmtool/next with 3.6.0-rc1. Looks like we > are doing uncore_init() on virtualized CPU which breaks boot. I think you're the first.. I don't normally use kvm if I can at all avoid it. But I think its a 'simple'

Re: perf uncore & lkvm woes

2012-08-16 Thread Peter Zijlstra
On Thu, 2012-08-16 at 14:06 +0300, Avi Kivity wrote: > Another option is to deal with them on the host side. That has the > benefit of working with non-Linux guests too. Right, its an insane amount of MSRs though, but it could be done if someone takes the time to enumerate them all. If KVM then

Re: perf uncore & lkvm woes

2012-08-16 Thread Peter Zijlstra
On Fri, 2012-08-17 at 09:40 +0800, Yan, Zheng wrote: > > Peter, do I need to submit a patch disables uncore on virtualized CPU? > I think Avi prefers the method where KVM 'fakes' the MSRs and we have to detect if the MSRs actually work or not. If you're willing to have a go at that, please do so

Re: perf uncore & lkvm woes

2012-08-21 Thread Peter Zijlstra
On Sun, 2012-08-19 at 12:55 +0300, Avi Kivity wrote: > > I think Avi prefers the method where KVM 'fakes' the MSRs and we have to > > detect if the MSRs actually work or not. > > s/we have/we don't have/. So for the 'normal' PMU we actually do check to see if the MSRs are being faked and bail if

Re: perf uncore & lkvm woes

2012-08-21 Thread Peter Zijlstra
On Tue, 2012-08-21 at 11:34 +0300, Avi Kivity wrote: > On 08/21/2012 10:11 AM, Peter Zijlstra wrote: > > On Sun, 2012-08-19 at 12:55 +0300, Avi Kivity wrote: > >> > I think Avi prefers the method where KVM 'fakes' the MSRs and we have to > >> > detect if

Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Thu, 2011-09-01 at 17:55 -0700, Jeremy Fitzhardinge wrote: > From: Jeremy Fitzhardinge > > We need to make sure interrupts are disabled while we're relying on the > contents of the per-cpu lock_waiting values, otherwise an interrupt > handler could come in, try to take some other lock, block,

Re: [PATCH 10/13] xen/pvticket: allow interrupts to be enabled while blocking

2011-09-02 Thread Peter Zijlstra
On Thu, 2011-09-01 at 17:55 -0700, Jeremy Fitzhardinge wrote: > + /* Make sure an interrupt handler can't upset things in a > + partially setup state. */ > local_irq_save(flags); > > + /* > +* We don't really care if we're overwriting some other > +* (

Re: [PATCH 11/13] x86/ticketlock: only do kick after doing unlock

2011-09-02 Thread Peter Zijlstra
On Thu, 2011-09-01 at 17:55 -0700, Jeremy Fitzhardinge wrote: > From: Srivatsa Vaddagiri > > We must release the lock before checking to see if the lock is in > slowpath or else there's a potential race where the lock enters the > slow path after the unlocker has checked the slowpath flag, but be

Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote: > > > I know that its generally considered bad form, but there's at least one > > spinlock that's only taken from NMI context and thus hasn't got any > > deadlock potential. > > Which one? arch/x86/kernel/traps.c:nmi_reason_lock It

Re: [PATCH 08/13] xen/pvticketlock: disable interrupts while blocking

2011-09-02 Thread Peter Zijlstra
On Fri, 2011-09-02 at 14:50 -0700, Jeremy Fitzhardinge wrote: > On 09/02/2011 01:47 PM, Peter Zijlstra wrote: > > On Fri, 2011-09-02 at 12:29 -0700, Jeremy Fitzhardinge wrote: > >>> I know that its generally considered bad form, but there's at least one > >>&g

Re: [PATCH v2 6/9] perf, intel: Use GO/HO bits in perf-ctr

2011-10-05 Thread Peter Zijlstra
support them if need arise. Signed-off-by: Gleb Natapov Signed-off-by: Peter Zijlstra Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-g...@redhat.com --- arch/x86/include/asm/perf_event.h | 12 arch/x86/kernel/cpu/perf_event.h | 12 arch/x86/kernel/cpu

Re: [PATCH v2 8/9] KVM, VMX: Add support for guest/host-only profiling

2011-10-05 Thread Peter Zijlstra
On Wed, 2011-10-05 at 14:01 +0200, Gleb Natapov wrote: > +static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) > +{ > + int i, nr_msrs; > + struct perf_guest_switch_msr *msrs; > + > + msrs = perf_guest_get_msrs(&nr_msrs); > + > + if (!msrs) > + return; > +

Re: [PATCH v2 0/9] perf support for x86 guest/host-only bits

2011-10-05 Thread Peter Zijlstra
On Wed, 2011-10-05 at 15:48 +0200, Avi Kivity wrote: > On 10/05/2011 02:01 PM, Gleb Natapov wrote: > > This patch series consists of Joerg series named "perf support for amd > > guest/host-only bits v2" [1] rebased to 3.1.0-rc7 and in addition, > > support for intel cpus for the same functionality.

Re: [PATCH v2 8/9] KVM, VMX: Add support for guest/host-only profiling

2011-10-05 Thread Peter Zijlstra
On Wed, 2011-10-05 at 17:29 +0200, Gleb Natapov wrote: > On Wed, Oct 05, 2011 at 04:19:39PM +0200, Peter Zijlstra wrote: > > On Wed, 2011-10-05 at 14:01 +0200, Gleb Natapov wrote: > > > +static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) > > > +{ &

Re: [PATCH RFC V5 00/11] Paravirtualized ticketlocks

2011-10-13 Thread Peter Zijlstra
On Wed, 2011-10-12 at 17:51 -0700, Jeremy Fitzhardinge wrote: > > This is is all unnecessary complication if you're not using PV ticket > locks, it also uses the jump-label machinery to use the standard > "add"-based unlock in the non-PV case. > > if (TICKET_SLOWPATH_FLAG && >

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-07 Thread Peter Zijlstra
On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > @@ -1580,6 +1580,8 @@ __init int intel_pmu_init(void) > x86_pmu.num_counters= eax.split.num_counters; > x86_pmu.cntval_bits = eax.split.bit_width; > x86_pmu.cntval_mask = (1ULL << ea

Re: [PATCHv2 7/9] KVM: Expose the architectural performance monitoring CPUID leaf

2011-11-07 Thread Peter Zijlstra
On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > + case 0xa: { /* Architectural Performance Monitoring */ > + struct x86_pmu_capability cap; > + > + perf_get_x86_pmu_capability(&cap); > + > + /* > +* Only support guest architec

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > @@ -35,6 +35,7 @@ config KVM > select KVM_MMIO > select TASKSTATS > select TASK_DELAY_ACCT > + select PERF_EVENTS Do you really want to make that an unconditional part of KVM? I know we can't currently build x8

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > +static void kvm_perf_overflow_intr(struct perf_event *perf_event, > + struct perf_sample_data *data, struct pt_regs *regs) > +{ > + struct kvm_pmc *pmc = perf_event->overflow_handler_context; > + struct kvm_pmu *pmu

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > +static u64 read_pmc(struct kvm_pmc *pmc) > +{ > + u64 counter, enabled, running; > + > + counter = pmc->counter; > + > + if (pmc->perf_event) > + counter += perf_event_read_value(pmc->perf_event, > +

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 16:46 +0200, Avi Kivity wrote: > On 11/07/2011 04:34 PM, Peter Zijlstra wrote: > > On Thu, 2011-11-03 at 14:33 +0200, Gleb Natapov wrote: > > > +static void kvm_perf_overflow_intr(struct perf_event *perf_event, > > > + struct per

Re: [PATCHv2 7/9] KVM: Expose the architectural performance monitoring CPUID leaf

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 17:41 +0200, Gleb Natapov wrote: > > > + entry->eax = min(cap.version, 2) > > > + | (cap.num_counters_gp << 8) > > > + | (cap.bit_width_gp << 16) > > > + | (cap.events_mask_len << 24); > Do you
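The quoted expression packs four fields into EAX of CPUID leaf 0xA: version in bits 7:0, number of general-purpose counters in bits 15:8, counter bit width in bits 23:16, and events mask length in bits 31:24. A small sketch of that packing follows. The struct pmu_cap_sketch and the function name are hypothetical, with field names borrowed from the quoted patch.

```c
#include <stdint.h>

/* Hypothetical mirror of the capability fields used in the quoted patch. */
struct pmu_cap_sketch {
    int version;
    int num_counters_gp;
    int bit_width_gp;
    int events_mask_len;
};

/* Pack the fields the same way as the quoted entry->eax expression. */
static uint32_t pack_cpuid_0xa_eax_sketch(const struct pmu_cap_sketch *cap)
{
    /* min(cap.version, 2): never advertise more than version 2. */
    uint32_t version = cap->version < 2 ? (uint32_t)cap->version : 2U;

    return version
        | ((uint32_t)cap->num_counters_gp << 8)
        | ((uint32_t)cap->bit_width_gp << 16)
        | ((uint32_t)cap->events_mask_len << 24);
}
```

With a host reporting version 3, 4 counters, 48-bit width and a mask length of 7, the packed value is 0x07300402.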

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 17:53 +0200, Gleb Natapov wrote: > I removed branch-miss-retired here because for perf user it exists. Perf > approximates it by other event but perf user shouldn't know that. A > guest is not always runs with exactly same cpu model number as a host, > so if we will not drop t

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 17:25 +0200, Avi Kivity wrote: > On 11/07/2011 05:19 PM, Gleb Natapov wrote: > > > > > > note, this needs a fairly huge PMI skew to happen. > > > > > No, it need not. It is enough to get exit reason as hlt instead of nmi > > for a vcpu to go to blocking state instead of reen

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 18:22 +0200, Gleb Natapov wrote: > > Right, so what model number do you expose? > Depends on what management wants. You can specify -cpu Nehalem or -cpu > Conroe or even override model manually by doing -cpu host,model=15. Oh cute ;-)

Re: [PATCHv2 2/9] KVM: Expose a version 2 architectural PMU to a guests

2011-11-07 Thread Peter Zijlstra
On Mon, 2011-11-07 at 17:25 +0200, Gleb Natapov wrote: > > Since the below programming doesn't use perf_event_attr::pinned, yes. > > > Yes, that is on todo :). Actually I do want to place all guest perf > counters into the same event group and make it pinned. But currently perf > event groups are

Re: [F.A.Q.] perf ABI backwards and forwards compatibility

2011-11-08 Thread Peter Zijlstra
On Tue, 2011-11-08 at 11:22 +0100, Ingo Molnar wrote: > > We do even more than that, the perf ABI is fully backwards *and* > forwards compatible: you can run older perf on newer ABIs and newer > perf on older ABIs. The ABI yes, the tool no, the tool very much relies on some newer ABI parts. Su

Re: [F.A.Q.] perf ABI backwards and forwards compatibility

2011-11-08 Thread Peter Zijlstra
On Tue, 2011-11-08 at 13:15 +0100, Ingo Molnar wrote: > > The one notable thing that isnt being tested in a natural way is the > 'group of events' abstraction - which, ironically, has been added on > the perfmon guys' insistence. No app beyond the PAPI self-test makes > actual use of it though,

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-08 Thread Peter Zijlstra
On Tue, 2011-11-08 at 14:49 +0200, Gleb Natapov wrote: > > It might make sense to introduce cpuid10_ebx or so, also I think the > cpuid10_ebx will have only one field though (event_mask). > > > At the very least add a full ebx iteration to disable unsupported events > > in the intel-v1 case. > I d

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-08 Thread Peter Zijlstra
On Tue, 2011-11-08 at 15:54 +0200, Gleb Natapov wrote: > Isn't it better to introduce mapping between ebx bits and architectural > events and do for_each_set_bit loop? Probably, but I only thought of that halfway through ;-) > But I wouldn't want to introduce > patch as below as part of this ser

Re: [PATCHv2 6/9] perf: expose perf capability to other modules.

2011-11-08 Thread Peter Zijlstra
On Tue, 2011-11-08 at 16:18 +0200, Gleb Natapov wrote: > On Tue, Nov 08, 2011 at 03:12:27PM +0100, Peter Zijlstra wrote: > > On Tue, 2011-11-08 at 15:54 +0200, Gleb Natapov wrote: > > > Isn't it better to introduce mapping between ebx bits and architectural > > >

Re: [F.A.Q.] perf ABI backwards and forwards compatibility

2011-11-09 Thread Peter Zijlstra
On Tue, 2011-11-08 at 13:59 +0100, Ingo Molnar wrote: > > > Also the self monitor stuff, perf-tool doesn't use that for obvious > > reasons. > > Indeed, and that's PAPI's strong point. > > We could try to utilize it via some clever LD_PRELOAD trickery? Wouldn't be really meaningful, a perf-tes

Re: [F.A.Q.] the advantages of a shared tool/kernel Git repository, tools/perf/ and tools/kvm/

2011-11-09 Thread Peter Zijlstra
On Wed, 2011-11-09 at 10:33 -0200, Arnaldo Carvalho de Melo wrote: > > Ingo, would that G+ page be useful for that? > *groan* Can we please keep things sane?

Re: [PATCHv3 06/10] x86, perf: disable non available architectural events.

2011-11-17 Thread Peter Zijlstra
On Thu, 2011-11-10 at 14:57 +0200, Gleb Natapov wrote: > + > + /* disable event that reported as not presend by cpuid */ > + for_each_set_bit(bit, x86_pmu.events_mask, > + min(x86_pmu.events_mask_len, x86_pmu.max_events)) > + intel_perfmon_event_map[i

Re: [PATCHv3 00/10] KVM in-guest performance monitoring

2011-11-17 Thread Peter Zijlstra
On Thu, 2011-11-10 at 14:57 +0200, Gleb Natapov wrote: > This patchset exposes an emulated version 2 architectural performance > monitoring unit to KVM guests. The PMU is emulated using perf_events, > so the host kernel can multiplex host-wide, host-user, and the > guest on available resources. >

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-21 Thread Peter Zijlstra
On Mon, 2011-11-21 at 20:48 +0530, Bharata B Rao wrote: > I looked at Peter's recent work in this area. > (https://lkml.org/lkml/2011/11/17/204) > > It introduces two interfaces: > > 1. ms_tbind() to bind a thread to a memsched(*) group > 2. ms_mbind() to bind a memory region to memsched group >

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-21 Thread Peter Zijlstra
On Mon, 2011-11-21 at 21:30 +0530, Bharata B Rao wrote: > > In the original post of this mail thread, I proposed a way to export > guest RAM ranges (Guest Physical Address-GPA) and their corresponding host > host virtual mappings (Host Virtual Address-HVA) from QEMU (via QEMU monitor). > The idea

Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding

2011-11-21 Thread Peter Zijlstra
On Mon, 2011-11-21 at 20:03 +0200, Avi Kivity wrote: > > Does ms_mbind() require that its vmas in its area be completely > contained in the region, or does it split vmas on demand? I suggest the > latter to avoid exposing implementation details. as implemented (which is still rather incomplete)

Re: [RFC -v5 PATCH 2/4] sched: Add yield_to(task, preempt) functionality.

2011-01-14 Thread Peter Zijlstra
On Fri, 2011-01-14 at 03:03 -0500, Rik van Riel wrote: > From: Mike Galbraith > > Currently only implemented for fair class tasks. > > Add a yield_to_task method() to the fair scheduling class. allowing the > caller of yield_to() to accelerate another thread in it's thread group, > task group. >

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-19 Thread Peter Zijlstra
On Wed, 2011-01-19 at 22:42 +0530, Srivatsa Vaddagiri wrote: > Add two hypercalls to KVM hypervisor to support pv-ticketlocks. > > KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or it > is woken up because of an event like interrupt. > > KVM_HC_KICK_CPU allows the callin

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-19 Thread Peter Zijlstra
On Wed, 2011-01-19 at 22:53 +0530, Srivatsa Vaddagiri wrote: > On Wed, Jan 19, 2011 at 10:42:39PM +0530, Srivatsa Vaddagiri wrote: > > Add two hypercalls to KVM hypervisor to support pv-ticketlocks. > > > > KVM_HC_WAIT_FOR_KICK blocks the calling vcpu until another vcpu kicks it or > > it > > is

Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-20 Thread Peter Zijlstra
On Thu, 2011-01-20 at 17:29 +0530, Srivatsa Vaddagiri wrote: > > If we had a yield-to [1] sort of interface _and_ information on which vcpu > owns a lock, then lock-spinners can yield-to the owning vcpu, and then I'd nak it for being stupid ;-) really, yield*() is retarded, never even consider

Re: [RFC -v6 PATCH 2/8] sched: limit the scope of clear_buddies

2011-01-24 Thread Peter Zijlstra
On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote: > The clear_buddies function does not seem to play well with the concept > of hierarchical runqueues. In the following tree, task groups are > represented by 'G', tasks by 'T', next by 'n' and last by 'l'. > > (nl) > /\ >G(nl

Re: [RFC -v6 PATCH 3/8] sched: use a buddy to implement yield_task_fair

2011-01-24 Thread Peter Zijlstra
On Thu, 2011-01-20 at 16:33 -0500, Rik van Riel wrote: > Use the buddy mechanism to implement yield_task_fair. This > allows us to skip onto the next highest priority se at every > level in the CFS tree, unless doing so would introduce gross > unfairness in CPU time distribution. > > We order the

Re: [RFC -v6 PATCH 4/8] sched: Add yield_to(task, preempt) functionality

2011-01-24 Thread Peter Zijlstra
On Thu, 2011-01-20 at 16:34 -0500, Rik van Riel wrote: > From: Mike Galbraith > > Currently only implemented for fair class tasks. > > Add a yield_to_task method() to the fair scheduling class. allowing the > caller of yield_to() to accelerate another thread in it's thread group, > task group. >

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-24 Thread Peter Zijlstra
t scheduler > would wrongly think that all cpus have the same ability to run processes, > lowering the overall throughput. > > Signed-off-by: Glauber Costa > CC: Rik van Riel > CC: Jeremy Fitzhardinge > CC: Peter Zijlstra > CC: Avi Kivity > --- > include/linux/sched.

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-24 Thread Peter Zijlstra
On Mon, 2011-01-24 at 16:51 -0200, Glauber Costa wrote: > > I would really much rather see you change update_rq_clock_task() and > > subtract your ns resolution steal time from our wall-time, > > update_rq_clock_task() already updates the cpu_power relative to the > > remaining time available. > >

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-24 Thread Peter Zijlstra
On Mon, 2011-01-24 at 16:51 -0200, Glauber Costa wrote: > > > I thought kvm had a ns resolution steal-time clock? > Yes, the one I introduced earlier in this series is nsec. However, user > and system will be accounted in usec at most, so there is no point in > using nsec here. Well, the schedule

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-25 Thread Peter Zijlstra
On Tue, 2011-01-25 at 18:02 -0200, Glauber Costa wrote: > I fail to see how does clock_task influence cpu power. > If we also have to touch clock_task for better accounting of other > stuff, it is a separate story. > But for cpu_power, I really fail. Please enlighten me. static void update_rq_clo

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-25 Thread Peter Zijlstra
On Tue, 2011-01-25 at 18:47 -0200, Glauber Costa wrote: > On Tue, 2011-01-25 at 21:13 +0100, Peter Zijlstra wrote: > > On Tue, 2011-01-25 at 18:02 -0200, Glauber Costa wrote: > > > > > I fail to see how does clock_task influence cpu power. > > > If we also ha

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-26 Thread Peter Zijlstra
On Tue, 2011-01-25 at 19:27 -0200, Glauber Costa wrote: > On Tue, 2011-01-25 at 22:07 +0100, Peter Zijlstra wrote: > > On Tue, 2011-01-25 at 18:47 -0200, Glauber Costa wrote: > > > On Tue, 2011-01-25 at 21:13 +0100, Peter Zijlstra wrote: > > > > On Tue, 2011-01-25

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-26 Thread Peter Zijlstra
On Wed, 2011-01-26 at 13:43 -0200, Glauber Costa wrote: > yes, but once this delta is subtracted from rq->clock_task, this value is not > used to dictate power, unless I am mistaken. > > power is adjusted according to scale_rt_power(), which does it using the > values of rq->rt_avg, rq->age_stamp

Re: [PATCH 16/16] KVM-GST: adjust scheduler cpu power

2011-01-26 Thread Peter Zijlstra
On Wed, 2011-01-26 at 17:46 +0100, Peter Zijlstra wrote: > it uses a per-cpu virt_steal_time() clock which is > expected to return steal-time in ns. This clock should return u64 and wrap on u64 and be provided when CONFIG_SCHED_PARAVIRT.

Re: [PATCH v2 2/6] KVM-HV: KVM Steal time implementation

2011-01-31 Thread Peter Zijlstra
On Fri, 2011-01-28 at 14:52 -0500, Glauber Costa wrote: > + u64 to = (get_kernel_ns() - vcpu->arch.this_time_out); > + /* > +* using nanoseconds introduces noise, which accumulates > easily > +* leading to big steal time values. We want,

Re: [PATCH v2 4/6] KVM-GST: KVM Steal time registration

2011-01-31 Thread Peter Zijlstra
On Fri, 2011-01-28 at 14:52 -0500, Glauber Costa wrote: > + /* > +* using nanoseconds introduces noise, which accumulates easily > +* leading to big steal time values. We want, however, to keep the > +* interface nanosecond-based for future-proofness. The hypervisor ma

Re: [PATCH v2 5/6] KVM-GST: adjust scheduler cpu power

2011-01-31 Thread Peter Zijlstra
On Fri, 2011-01-28 at 14:52 -0500, Glauber Costa wrote: > +#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING > +static DEFINE_PER_CPU(u64, cpu_steal_time); > + > +#ifndef CONFIG_64BIT > +static DEFINE_PER_CPU(seqcount_t, steal_time_seq); > + > +static inline void steal_time_write_begin(void) > +{ > + __t

Re: [PATCH v2 5/6] KVM-GST: adjust scheduler cpu power

2011-01-31 Thread Peter Zijlstra
On Mon, 2011-01-31 at 12:25 +0100, Peter Zijlstra wrote: > On Fri, 2011-01-28 at 14:52 -0500, Glauber Costa wrote: > > > +#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING > > +static DEFINE_PER_CPU(u64, cpu_steal_time); > > + > > +#ifndef CONFIG_64BIT > > +static DEFIN

Re: [RFC -v7 PATCH 3/7] sched: use a buddy to implement yield_task_fair

2011-01-31 Thread Peter Zijlstra
On Wed, 2011-01-26 at 17:21 -0500, Rik van Riel wrote: > +static struct sched_entity *__pick_second_entity(struct cfs_rq *cfs_rq) > +{ > + struct rb_node *left = cfs_rq->rb_leftmost; > + struct rb_node *second; > + > + if (!left) > + return NULL; > + > + second = rb_nex

Re: [RFC -v7 PATCH 4/7] Add yield_to(task, preempt) functionality.

2011-01-31 Thread Peter Zijlstra
On Wed, 2011-01-26 at 17:21 -0500, Rik van Riel wrote: > +bool __sched yield_to(struct task_struct *p, bool preempt) > +{ > + struct task_struct *curr = current; > + struct rq *rq, *p_rq; > + unsigned long flags; > + bool yielded = 0; > + > + local_irq_save(flags); > +

Re: [RFC -v7 PATCH 5/7] export pid symbols needed for kvm_vcpu_on_spin

2011-01-31 Thread Peter Zijlstra
On Wed, 2011-01-26 at 17:23 -0500, Rik van Riel wrote: > Export the symbols required for a race-free kvm_vcpu_on_spin. Avi, you asked for an example of why I hated KVM as a module :-) > Signed-off-by: Rik van Riel > > diff --git a/kernel/fork.c b/kernel/fork.c > index 3b159c5..adc8f47 100644 >

Re: [RFC -v7 PATCH 5/7] export pid symbols needed for kvm_vcpu_on_spin

2011-01-31 Thread Peter Zijlstra
On Mon, 2011-01-31 at 15:26 +0200, Avi Kivity wrote:
> On 01/31/2011 01:51 PM, Peter Zijlstra wrote:
> > On Wed, 2011-01-26 at 17:23 -0500, Rik van Riel wrote:
> > > Export the symbols required for a race-free kvm_vcpu_on_spin.
> >
> > Avi, you asked for an example

Re: [PATCH -v8 0/7] directed yield for Pause Loop Exiting

2011-02-01 Thread Peter Zijlstra
On Mon, 2011-01-31 at 16:40 -0500, Rik van Riel wrote:
> v8:
> - some more changes and cleanups suggested by Peter
Did you, by accident, send out the -v7 patches again? I don't think I've spotted a difference..

Re: [PATCH -v8a 4/7] sched: Add yield_to(task, preempt) functionality

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 09:50 -0500, Rik van Riel wrote:
> +/**
> + * yield_to - yield the current processor to another thread in
> + * your thread group, or accelerate that thread toward the
> + * processor it's on.
> + *
> + * It's the caller's job to ensure that the target task struct
> + * can't

Re: [PATCH -v8a 3/7] sched: use a buddy to implement yield_task_fair

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 09:51 -0500, Rik van Riel wrote:
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -375,13 +375,6 @@ static struct ctl_table kern_table[] = {
> .mode = 0644,
> .proc_handler = sched_rt_handler,
> },
> - {
> -

Re: [PATCH v2 4/6] KVM-GST: KVM Steal time registration

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 13:53 -0200, Glauber Costa wrote:
> And since the granularity of the cpu accounting is too coarse, we end up
> with much more steal time than we should, because things that are less
> than 1 unity of cputime, are often rounded up to 1 unity of cputime.
See, that! is the pr

Re: [PATCH v2 5/6] KVM-GST: adjust scheduler cpu power

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 13:59 -0200, Glauber Costa wrote:
> Because that part is kvm-specific, and this is scheduler general.
> It seemed cleaner to me to do it this way. But I can do it differently,
> certainly.
Well, any steal time clock will be hypervisor specific, but if we agree that anythi

Re: [PATCH v2 4/6] KVM-GST: KVM Steal time registration

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 15:00 -0200, Glauber Costa wrote:
> > What you can do is: steal_ticks = steal_time_clock() / TICK_NSEC, or
> > simply keep a steal time delta and every time it overflows
> > cputime_one_jiffy insert a steal-time tick.
>
> What do you think about keeping accounting in msec/
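
The second suggestion quoted above (keep a running steal-time delta and emit a tick each time it crosses one tick's worth of time) might be sketched in plain userspace C as below. This is only an illustration, not code from the patch series: struct steal_acct, steal_acct_update() and SKETCH_TICK_NSEC are hypothetical names, and the 1 ms tick length is an assumption standing in for whatever the kernel derives from HZ.

```c
#include <stdint.h>

/* Hypothetical tick length in nanoseconds (assumes a 1000 Hz tick). */
#define SKETCH_TICK_NSEC 1000000ULL

/* Hypothetical per-CPU bookkeeping for steal time accounting. */
struct steal_acct {
	uint64_t prev_clock; /* last steal clock reading seen, in ns */
	uint64_t pending;    /* stolen ns not yet converted into ticks */
};

/*
 * Fold a new steal clock reading into the accumulator and return how
 * many whole steal ticks should be accounted now. The sub-tick
 * remainder stays in 'pending', so nothing is rounded up early.
 */
static uint64_t steal_acct_update(struct steal_acct *a, uint64_t now_ns)
{
	uint64_t ticks;

	a->pending += now_ns - a->prev_clock;
	a->prev_clock = now_ns;

	ticks = a->pending / SKETCH_TICK_NSEC;
	a->pending -= ticks * SKETCH_TICK_NSEC;
	return ticks;
}
```

Because the remainder is carried forward rather than rounded, several short steal periods only add up to a tick once their sum really reaches one tick length.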

Re: [PATCH v2 5/6] KVM-GST: adjust scheduler cpu power

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 14:22 -0200, Glauber Costa wrote:
> Which tick accounting? In your other e-mail, you pointed that this only
> runs in touch_steal_time, which is fine, will change.
That tick ;-), all the account_foo muck is per tick.
> But all the rest
> here, that is behind the hype

Re: [PATCH v2 5/6] KVM-GST: adjust scheduler cpu power

2011-02-01 Thread Peter Zijlstra
On Tue, 2011-02-01 at 17:55 -0200, Glauber Costa wrote:
> update_rq_clock_task still have to keep track of what was the last steal
> time value we saw, in the same way it does for irq.
Right, the CONFIG_SCHED_PARAVIRT patch I sent earlier adds a prev_steal_time member to struct rq for this purp
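
The bookkeeping discussed here (remember the last steal time value seen, the same way it is done for irq time, and subtract only the difference from the task clock) could look roughly like the userspace sketch below. It is not the patch itself: struct rq_sketch and rq_sketch_update_clock_task() are hypothetical stand-ins, the irq and steal readings are passed in as plain arguments instead of being read from per-CPU clocks, and delta is assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the scheduler's struct rq. */
struct rq_sketch {
	int64_t  clock_task;      /* time actually available to tasks, ns */
	uint64_t prev_irq_time;   /* irq time already subtracted, ns */
	uint64_t prev_steal_time; /* steal time already subtracted, ns */
};

/*
 * Advance clock_task by 'delta' ns of wall time, minus whatever part
 * of that window was spent in irq context or stolen by the hypervisor.
 * Each deduction is clamped to the remaining delta so that clock_task
 * never moves backwards; the unclaimed part is picked up next call.
 */
static void rq_sketch_update_clock_task(struct rq_sketch *rq, int64_t delta,
					uint64_t irq_now, uint64_t steal_now)
{
	int64_t irq_delta = (int64_t)(irq_now - rq->prev_irq_time);
	int64_t steal;

	if (irq_delta > delta)
		irq_delta = delta;
	rq->prev_irq_time += (uint64_t)irq_delta;
	delta -= irq_delta;

	steal = (int64_t)(steal_now - rq->prev_steal_time);
	if (steal > delta)
		steal = delta;
	rq->prev_steal_time += (uint64_t)steal;
	delta -= steal;

	rq->clock_task += delta;
}
```

Advancing prev_steal_time only by the clamped amount means steal time that did not fit into one update is carried over and deducted on a later update, rather than being dropped.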

Re: [PATCH -v8a 3/7] sched: use a buddy to implement yield_task_fair

2011-02-03 Thread Peter Zijlstra
On Tue, 2011-02-01 at 09:51 -0500, Rik van Riel wrote:
> -static void yield_task_fair(struct rq *rq)
> -{
> - struct task_struct *curr = rq->curr;
> - struct cfs_rq *cfs_rq = task_cfs_rq(curr);
> - struct sched_entity *rightmost, *se = &curr->se;
> -
> - /*
> -  * Are

Re: [PATCH -v8a 4/7] sched: Add yield_to(task, preempt) functionality

2011-02-03 Thread Peter Zijlstra
On Tue, 2011-02-01 at 09:50 -0500, Rik van Riel wrote:
> +bool __sched yield_to(struct task_struct *p, bool preempt)
> +{
> + struct task_struct *curr = current;
> + struct rq *rq, *p_rq;
> + unsigned long flags;
> + bool yielded = 0;
> +
> + local_irq_save(flags);
> +

Re: [PATCH v3 5/6] KVM-GST: adjust scheduler cpu power

2011-02-11 Thread Peter Zijlstra
On Fri, 2011-02-11 at 13:19 -0500, Glauber Costa wrote:
> static void update_rq_clock_task(struct rq *rq, s64 delta)
> {
> + s64 irq_delta = 0, steal = 0;
>
> +#ifdef CONFIG_IRQ_TIME_ACCOUNTING
> irq_delta = irq_time_read(cpu_of(rq)) - rq->prev_irq_time;
>
> /*
> @@ -1926,20

Re: [PATCH v3 3/6] KVM-GST: KVM Steal time accounting

2011-02-11 Thread Peter Zijlstra
On Fri, 2011-02-11 at 13:19 -0500, Glauber Costa wrote:
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d747f94..5dbf509 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -302,6 +302,7 @@ long io_schedule_timeout(long timeout);
> extern void cpu_init (vo

Re: [PATCH v3 3/6] KVM-GST: KVM Steal time accounting

2011-02-15 Thread Peter Zijlstra
On Tue, 2011-02-15 at 16:35 +0200, Avi Kivity wrote:
> On 02/11/2011 08:19 PM, Glauber Costa wrote:
> > This patch accounts steal time in kernel/sched.
> > I kept it from last proposal, because I still see advantages
> > in it: Doing it here will give us easier access from scheduler
> > variab

Re: [PATCH v3 3/6] KVM-GST: KVM Steal time accounting

2011-02-15 Thread Peter Zijlstra
On Tue, 2011-02-15 at 17:17 +0200, Avi Kivity wrote:
> Ah, so we're all set. Do you know if any user tools process this
> information?
I suppose there are, I bet Jeremy knows, Xen after all supports this stuff ;-)
