On Thu, 2012-09-06 at 20:01 +0200, Oleg Nesterov wrote:
> Ping...
Right, email backlog :-)
> Peter, do you think you can do your make-it-lockless patch (hehe, I
> think this is not possible ;) on top?
Sure, I was trying to see if I could play games with the _cancel
semantics that would satisfy t
On Thu, 2012-09-06 at 17:46 +0200, Jiri Olsa wrote:
> The 'perf diff' and 'std/hist' code is now changed to allow computations
> mentioned in the paper. Two of them are implemented within this patchset:
> 1) ratio differential profiling
> 2) weighted differential profiling
Seems like a useful
On Thu, 2012-09-06 at 11:49 -0700, Josh Triplett wrote:
>
> Huh, I thought GCC knew to not emit that warning unless it actually
> found control flow reaching the end of the function; since the
> infinite
> loop has no break in it, you shouldn't need the return. Annoying.
tag the function with _
On Thu, 2012-09-06 at 13:51 -0700, Paul E. McKenney wrote:
> On Thu, Sep 06, 2012 at 04:38:32PM +0200, Peter Zijlstra wrote:
> > On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> > > +#ifdef CONFIG_PROVE_RCU_DELAY
> > > + udelay(10); /* Ma
On Thu, 2012-09-06 at 15:22 -0700, Paul E. McKenney wrote:
> Ah!
>
> It is perfectly legal to avoid -starting- an RCU grace period for a
> minute, or even longer. If RCU has nothing to do, in other words, if no
> one registers any RCU callbacks, then RCU need not start a grace period.
>
> Of cou
On Thu, 2012-09-06 at 14:25 -0700, Paul E. McKenney wrote:
> On Thu, Sep 06, 2012 at 08:41:09PM +0200, Peter Zijlstra wrote:
> > On Thu, 2012-09-06 at 17:46 +0200, Jiri Olsa wrote:
> > > The 'perf diff' and 'std/hist' code is now changed to allow computation
> bootup. If no new processes are spawned or no idle cycles happen, the
> load on the cpus will remain unbalanced for that duration.
>
> Signed-off-by: Diwakar Tundlam
> Signed-off-by: Peter Zijlstra
> Link:
> http://lkml.kernel.org/r/1dd7bfedd3147247b1355b
On Fri, 2012-09-07 at 22:33 +0900, Namhyung Kim wrote:
> 2012-09-07 (금), 11:28 +0200, Jiri Olsa:
> > On Fri, Sep 07, 2012 at 02:58:19PM +0900, Namhyung Kim wrote:
> > > I don't see why this do { } while(0) loop is necessary.
> > > How about this?
> > >
> > > w1 = strtol(opt, &tmp, 10);
> > > i
On Fri, 2012-09-07 at 16:29 +0200, Stephane Eranian wrote:
> @@ -148,6 +148,15 @@ static LIST_HEAD(pmus);
> static DEFINE_MUTEX(pmus_lock);
> static struct srcu_struct pmus_srcu;
>
> +struct perf_cpu_hrtimer {
> + struct hrtimer hrtimer;
> + int active;
> +};
> +
> +static DEFINE_PE
On Fri, 2012-09-07 at 16:29 +0200, Stephane Eranian wrote:
> Obsolete because superseded by hrtimer based
> multiplexing.
Not entirely, the jiffies_interval allows different PMUs to have
different rotation speeds. Your code doesn't allow this.
--
To unsubscribe from this list: send the line "unsu
On Fri, 2012-09-07 at 16:29 +0200, Stephane Eranian wrote:
Style nit:
> + if (h->active)
> + list_for_each_entry_safe(cpuctx, tmp, head, rotation_list)
> + rotations += perf_rotate_context(cpuctx);
> + if (!hrtimer_callback_running(hr))
> + __
On Fri, 2012-09-07 at 08:31 -0700, Arnaldo Carvalho de Melo wrote:
> People don't like goto's, but that is overstated, for error handling
> it
> is perfectly fine :-)
http://marc.info/?l=linux-arch&m=120852974023791&w=2
On Fri, 2012-09-07 at 18:41 +0200, Robert Richter wrote:
> From 1d037614edef576da441936bd8c917d31f57b179 Mon Sep 17 00:00:00 2001
> From: Robert Richter
> Date: Wed, 25 Jul 2012 19:12:45 +0200
> Subject: [PATCH] perf, ibs: Check syscall attribute flags
>
> Current implementation simply ignores at
On Fri, 2012-09-07 at 19:03 +0200, Stephane Eranian wrote:
> I think having different intervals would be a good thing, especially for
> uncore.
> But now, I am wondering how this could work without too much overhead.
> Looks like you're suggesting arming multiple hrtimers if multiple PMU are
> ove
On Fri, 2012-09-07 at 19:18 +0200, Robert Richter wrote:
> I was thinking of this too. But this breaks compilation of existing
> code, since static initialization of struct perf_event_attr fails, e.g.:
>
> builtin-test.c:469:3: error: unknown field 'watermark' specified in
> initializer
>
>
Oh bugg
On Fri, 2012-09-07 at 21:10 +0200, Stephane Eranian wrote:
>
> That's true. I started modifying my code to implement your suggestion.
> We'll see how it goes. Then we would have to export that mux interval
> via sysfs for each PMU.
Indeed. Thanks!
On Fri, 2012-09-07 at 11:39 -0700, Linus Torvalds wrote:
> Al? Please look into this. I'm not entirely sure what's going on, but
> lockdep complains about this:
>
> Possible interrupt unsafe locking scenario:
>
>        CPU0                    CPU1
>
> lock(
On Sun, 2012-09-09 at 01:19 +0300, Irina Tirdea wrote:
> >> +#ifndef __WORDSIZE
> >> +#if defined(__x86_64__)
> >> +# define __WORDSIZE 64
> >> +#endif
> >> +#if defined(__i386__) || defined(__arm__)
> >> +# define __WORDSIZE 32
> >> +#endif
> >> +#endif
> >
> > Why not use "sizeof(unsigned long) *
On Mon, 2012-09-10 at 15:10 +0800, Alex Shi wrote:
> There is no load_balancer to be selected now. It just set state of
> nohz tick stopping.
>
> So rename the function, pass the 'cpu' from parameter and then
> remove the useless calling from tick_nohz_restart_sched_tick().
Please check who wrote
On Mon, 2012-09-10 at 08:16 -0500, Andrew Theurer wrote:
> > > @@ -4856,8 +4859,6 @@ again:
> > > if (curr->sched_class != p->sched_class)
> > > goto out;
> > >
> > > - if (task_running(p_rq, p) || p->state)
> > > - goto out;
> >
> > Is it possible that by this time th
On Mon, 2012-09-10 at 15:53 +0800, Yan, Zheng wrote:
> Hi,
>
> This patchset add a cpumask file to the uncore pmu sysfs directory.
> If user doesn't explicitly specify CPU list, perf-stat only collects
> uncore events on CPUs listed in the cpumask file.
>
> As Stephane suggested, make perf-stat r
On Mon, 2012-09-10 at 18:50 +0200, Jiri Olsa wrote:
> + maps = fopen("/proc/self/maps", "r");
> + if (!maps) {
> + pr_err("vdso: cannot open maps\n");
> + return -1;
> + }
> +
> + while (!found && fgets(line, sizeof(line), maps)) {
> +
On Mon, 2012-09-10 at 10:40 -0600, David Ahern wrote:
> Hopefully this wraps up the precise mode-exclude_guest dependency.
> I'm sure someone will let me know if I screwed up the attribution
> in the second patch.
I'll wait with applying until we have the IBS stuff sorted, other than
that, thanks
On Mon, 2012-09-10 at 22:26 +0530, Srikar Dronamraju wrote:
> > +static bool __yield_to_candidate(struct task_struct *curr, struct
> > task_struct *p)
> > +{
> > + if (!curr->sched_class->yield_to_task)
> > + return false;
> > +
> > + if (curr->sched_class != p->sched_class)
>
On Mon, 2012-09-10 at 11:01 -0600, David Ahern wrote:
> On 9/10/12 10:57 AM, Peter Zijlstra wrote:
> > On Mon, 2012-09-10 at 10:40 -0600, David Ahern wrote:
> >> Hopefully this wraps up the precise mode-exclude_guest dependency.
> >> I'm sure someone will let me know
> Signed-off-by: David Ahern
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Robert Richter
> Cc: Gleb Natapov
> Cc: Avi Kivity
> ---
> tools/perf/util/parse-events.c |3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/util/parse-events.c b/
On Mon, 2012-09-10 at 10:40 -0600, David Ahern wrote:
> From: Peter Zijlstra
>
> See https://lkml.org/lkml/2012/7/9/298
Expanding that a little would be so much better.. take some of the reply
to 1/3 on why we have to enforce a strict exclude_guest.
On Mon, 2012-09-10 at 18:57 +0200, Sebastian Andrzej Siewior wrote:
> The only user that is touching this bits in irq context is perf. perf
> uses raw_local_irqsave() (raw_* most likely due to -RT).
# git grep raw_local_irq arch/x86/kernel/cpu/perf_* kernel/events/ | wc -l
0
I think you're confus
On Mon, 2012-09-10 at 15:12 -0500, Andrew Theurer wrote:
> + /*
> +* if the target task is not running, then only yield if the
> +* current task is in guest mode
> +*/
> + if (!(p_rq->curr->flags & PF_VCPU))
> + goto out_irq;
This would make yield
On Sat, 2012-10-06 at 09:39 +0200, Ingo Molnar wrote:
Thanks Ingo! Paul,
> tip/kernel/sched/fair.c | 28 ++--
> 1 file changed, 18 insertions(+), 10 deletions(-)
>
> Index: tip/kernel/sched/fair.c
> ===
>
On Mon, 2012-10-08 at 10:59 +0800, Tang Chen wrote:
> If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu) will
> return -1. As a result, cpumask_of_node(nid) will return NULL. In this case,
> find_next_bit() in for_each_cpu will get a NULL pointer and cause panic.
Hurm,. this is n
On Mon, 2012-10-08 at 14:38 +0200, Oleg Nesterov wrote:
> But the code looks more complex, and the only advantage is that
> non-exiting task does xchg() instead of cmpxchg(). Not sure this
> worth the trouble, in this case task_work_run() will likey run
> the callbacks (the caller checks ->task_wor
On Tue, 2012-10-09 at 06:37 -0700, Andi Kleen wrote:
> Ivo Sieben writes:
>
> > Check the waitqueue task list to be non-empty before entering the critical
> > section. This prevents locking the spin lock needlessly in case the queue
> > was empty, and therefore also prevents scheduling overhead on
On Tue, 2012-10-09 at 17:38 +0200, Andre Przywara wrote:
> First you need an AMD family 10h/12h CPU. These do not reset the
> PERF_CTR registers on a reboot.
> Now you boot bare metal Linux, which goes successfully through this
> check, but leaves the magic value of 0xabcd in the register. You
> do
On Tue, 2012-10-09 at 13:36 -0700, David Rientjes wrote:
> On Tue, 9 Oct 2012, Peter Zijlstra wrote:
>
> > On Mon, 2012-10-08 at 10:59 +0800, Tang Chen wrote:
> > > If a cpu is offline, its nid will be set to -1, and cpu_to_node(cpu) will
> > > return -1. As a res
On Tue, 2012-10-09 at 16:27 -0700, David Rientjes wrote:
> On Tue, 9 Oct 2012, Peter Zijlstra wrote:
>
> > Well the code they were patching is in the wakeup path. As I think Tang
> > said, we leave !runnable tasks on whatever cpu they ran on last, even if
> > that cpu is
On Wed, 2012-10-10 at 17:33 +0800, Wen Congyang wrote:
>
> Hmm, if per-cpu memory is preserved, and we can't offline and remove
> this memory. So we can't offline the node.
>
> But, if the node is hot added, and per-cpu memory doesn't use the
> memory on this node. We can hotremove cpu/memory on
On Wed, 2012-10-10 at 18:10 +0800, Wen Congyang wrote:
> I use ./scripts/get_maintainer.pl, and it doesn't tell me that I should cc
> you when I post that patch.
That script doesn't look at all usage sites of the code you modify does
it?
You need to audit the entire tree for usage of the interfa
On Wed, 2012-10-10 at 13:29 +0100, Mel Gorman wrote:
> Do we really switch more though?
>
> Look at the difference in interrupts vs context switch. IPIs are an interrupt
> so if TTWU_QUEUE wakes process B using an IPI, does that count as a context
> switch?
Nope. Nor would it for NO_TTWU_QUEUE. A
On Wed, 2012-10-10 at 14:53 +0200, Jiri Olsa wrote:
> +static ssize_t amd_event_sysfs_show(char *page, u64 config)
> +{
> + u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT) |
> + (config & AMD64_EVENTSEL_EVENT) >> 24;
> +
> + return x86_event_sysfs_show(page, config,
On Wed, 2012-10-10 at 16:25 +0200, Jiri Olsa wrote:
> On Wed, Oct 10, 2012 at 04:11:42PM +0200, Peter Zijlstra wrote:
> > On Wed, 2012-10-10 at 14:53 +0200, Jiri Olsa wrote:
> > > +static ssize_t amd_event_sysfs_show(char *page, u64 config)
> > > +{
> &
On Wed, 2012-10-10 at 17:44 +0200, Simon Klinkert wrote:
> I'm just wondering if the 'load' is really meaningful in this
> scenario. The machine is the whole time fully responsive and looks
> fine to me but maybe I didn't understand correctly what the load
> should mean. Is there any sensible inter
On Wed, 2012-10-10 at 19:50 +0200, Oleg Nesterov wrote:
>
> But you did not answer, and I am curious. What was your original
> motivation? Is xchg really faster than cmpxchg?
And is this true over multiple architectures? Or are we optimizing for
x86_64 (again) ?
On Wed, 2012-10-17 at 20:29 -0700, David Rientjes wrote:
>
> Ok, thanks for the update. I agree that we should be clearing the mapping
> at node hot-remove since any cpu that would subsequently get onlined and
> assume one of the previous cpu's ids is not guaranteed to have the same
> affinity
On Thu, 2012-10-18 at 17:20 -0400, Rik van Riel wrote:
> Having the function name indicate what the function is used
> for makes the code a little easier to read. Furthermore,
> the fault handling code largely consists of do__page
> functions.
I don't much care either way, but I was thinking
er/numa-problem.txt file should
> probably be rewritten once we figure out the final details of
> what the NUMA code needs to do, and why.
>
> Signed-off-by: Rik van Riel
Acked-by: Peter Zijlstra
Thanks Rik!
On Thu, 2012-10-18 at 15:28 -0400, Mikulas Patocka wrote:
>
> On Thu, 18 Oct 2012, Oleg Nesterov wrote:
>
> > Ooooh. And I just noticed include/linux/percpu-rwsem.h which does
> > something similar. Certainly it was not in my tree when I started
> > this patch... percpu_down_write() doesn't allow
On Fri, 2012-10-19 at 01:21 -0400, Dave Jones wrote:
> > Not sure why you are CC'ing a call site, rather than the maintainers of
> > the code. Just looks like lockdep is using too small a static value.
> > Though it is pretty darn large...
>
> You're right, it's a huge chunk of memory.
> It loo
to put the sysctl enabled check in autogroup_move_group(), kernel should check
> it before autogroup_create in sched_autogroup_create_attach().
>
> Reported-by: cwillu
> Reported-by: Luis Henriques
> Signed-off-by: Xiaotian Feng
> Cc: Ingo Molnar
> Cc: Peter Zijlst
On Wed, 2012-10-17 at 11:35 -0400, Vince Weaver wrote:
>
> This is by accident; it looks like the code does
>    val |= ARCH_PERFMON_EVENTSEL_ENABLE;
> in p6_pmu_disable_event() so that events are never truly disabled
> (is this a bug? should it be &=~ instead?).
I think that's on purpose.. f
On Thu, 2012-10-18 at 11:32 +0400, Vladimir Davydov wrote:
>
> 1) Do you agree that the problem exists and should be sorted out?
This is two questions.. yes it exists, I'm absolutely sure I pointed it
out as soon as people even started talking about this nonsense (bw
cruft).
Should it be sorted,
On Thu, 2012-10-18 at 09:40 -0400, Steven Rostedt wrote:
> Peter,
>
> There was a little conflict with my merge of 3.4.14 due to the backport
> of this patch:
>
> commit 947ca1856a7e60aa6d20536785e6a42dff25aa6e
> Author: Michael Wang
> Date: Wed Sep 5 10:33:18 2012 +0800
>
> slab: fix the
On Fri, 2012-10-19 at 09:51 -0400, Johannes Weiner wrote:
> Of course I'm banging my head into a wall for not seeing earlier
> through the existing migration path how easy this could be.
There's a reason I keep promoting the idea of 'someone' rewriting all
that page-migration code :-) I forever
On Fri, 2012-10-19 at 09:51 -0400, Johannes Weiner wrote:
> Right now, unlike the traditional migration path, this breaks COW for
> every migration, but maybe you don't care about shared pages in the
> first place. And fixing that should be nothing more than grabbing the
> anon_vma lock and using
On Fri, 2012-10-19 at 09:51 -0400, Johannes Weiner wrote:
> It's slightly ugly that migrate_page_copy() actually modifies the
> existing page (deactivation, munlock) when you end up having to revert
> back to it.
The worst is actually calling copy_huge_page() on a THP.. it seems to
work though ;-
On Fri, 2012-10-19 at 16:52 +0200, Stephane Eranian wrote:
> -modifier_event [ukhpGH]{1,8}
> +modifier_event [ukhpGHx]{1,8}
wouldn't the max modifier sting length grow by adding another possible
modifier?
On Fri, 2012-10-19 at 16:52 +0200, Stephane Eranian wrote:
> +static int intel_pebs_aliases_snb(struct perf_event *event)
> +{
> + u64 cfg = event->hw.config;
> + /*
> +* for INST_RETIRED.PREC_DIST to work correctly with PEBS, it must
> +* be measured alone on SNB (exclu
On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
>
> If we do need the extra refcount, why is normal
> page migration safe? :)
Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get
Normal does:
follow_page(FOLL_GET) +1
isolate
On Fri, 2012-10-19 at 18:31 +0200, Stephane Eranian wrote:
> On Fri, Oct 19, 2012 at 6:27 PM, Peter Zijlstra wrote:
> > On Fri, 2012-10-19 at 16:52 +0200, Stephane Eranian wrote:
> >> +static int intel_pebs_aliases_snb(struct perf_event *event)
> >> +{
> >>
On Fri, 2012-10-19 at 11:32 -0400, Mikulas Patocka wrote:
> So if you can do an alternative implementation without RCU, show it.
Uhm,,. no that's not how it works. You just don't push through crap like
this and then demand someone else does it better.
But using preempt_{disable,enable} and using
On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> Would it make sense to have the normal page migration code always
> work with the extra refcount, so we do not have to introduce a new
> MIGRATE_FAULT migration mode?
>
> On the other hand, compaction does not take the extra reference...
R
On Thu, 2012-10-18 at 17:02 +0200, Ralf Baechle wrote:
> CC mm/huge_memory.o
> mm/huge_memory.c: In function 'do_huge_pmd_prot_none':
> mm/huge_memory.c:789:3: error: incompatible type for argument 3 of
> 'update_mmu_cache'
That appears to have become update_mmu_cache_pmd(), which makes se
On Mon, 2012-10-22 at 14:11 +0800, Yan, Zheng wrote:
> + /* LBR callstack does not work well with FREEZE_LBRS_ON_PMI */
> + if (!cpuc->lbr_sel || !(cpuc->lbr_sel->config & LBR_CALL_STACK))
> + debugctl |= DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
How useful it is without this? How
On Mon, 2012-10-22 at 14:11 +0800, Yan, Zheng wrote:
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -160,8 +160,9 @@ enum perf_branch_sample_type {
> PERF_SAMPLE_BRANCH_ABORT= 1U << 7, /* transaction aborts */
> PERF_SAMPLE_BRANCH_INTX
On Sat, 2012-10-20 at 12:22 -0400, Frederic Weisbecker wrote:
> + if (empty) {
> + /*
> +* If an IPI is requested, raise it right away. Otherwise wait
> +* for the next tick unless it's stopped. Now if the arch uses
> +* some other
On Sun, 2012-10-21 at 05:56 -0700, tip-bot for Andrea Arcangeli wrote:
> In get_user_pages_fast() the TLB shootdown code can clear the pagetables
> before firing any TLB flush (the page can't be freed until the TLB
> flushing IPI has been delivered but the pagetables will be cleared well
> before
On Sat, 2012-10-20 at 21:06 +0200, Andrea Righi wrote:
> @@ -383,13 +383,7 @@ struct rq {
> struct list_head leaf_rt_rq_list;
> #endif
>
> + unsigned long __percpu *nr_uninterruptible;
This is O(nr_cpus^2) memory..
> +unsigned long nr_uninterruptible_cpu(int cpu)
> +{
> +
On Mon, 2012-10-22 at 14:55 +0300, Dan Carpenter wrote:
> Hello Peter Zijlstra,
>
> The patch 3d049f8a5398: "sched, numa, mm: Implement constant, per
> task Working Set Sampling (WSS) rate" from Oct 14, 2012, leads to the
> following warning:
> kernel/sch
On Mon, 2012-10-22 at 17:44 +0200, Stephane Eranian wrote:
>
> I know the answer, because I know what's going on under the
> hood. But what about the average user?
I'm still wondering if the avg user really thinks 'instructions' is a
useful metric for other than obtaining ipc measurements.
The
On Mon, 2012-10-22 at 18:08 +0200, Stephane Eranian wrote:
> > I'm still wondering if the avg user really thinks 'instructions' is
> a
> > useful metric for other than obtaining ipc measurements.
> >
> Yeah, for many users CPI (or IPC) is a useful metric.
Right but you don't get that using instru
straightforward, most of the patch
deals with adding the /proc/sys/kernel/sched_numa_scan_delay_ms tunable
knob.
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
[ Wrote the changelog, ran measurements
Avoid a few #ifdef's later on.
Signed-off-by: Peter Zijlstra
Cc: Paul Turner
Cc: Lee Schermerhorn
Cc: Christoph Lameter
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Ingo Molnar
---
kernel/sched/sched.h |6 ++
1 file changed, 6 inser
out a possible NULL pointer dereference in the
first version of this patch. ]
Based-on-idea-by: Andrea Arcangeli
Bug-Found-By: Dan Carpenter
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
[ Wrote
By accounting against the present PTEs, scanning speed reflects the
actual present (mapped) memory.
Suggested-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Ingo
the final details of
what the NUMA code needs to do, and why. ]
Signed-off-by: Rik van Riel
Acked-by: Peter Zijlstra
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
This is against tip.git numa/
grow enough 64bit
only page-flags to push the last-cpu out. ]
Suggested-by: Rik van Riel
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Ingo Molnar
---
include/linux/mm.h
Hi,
This series implements an improved version of NUMA scheduling, based on
the review and testing feedback we got.
Like the previous version, this code is driven by working set probing
faults (so much of the VM machinery remains) - but the subsequent
utilization of those faults and the scheduler
Add THP migration for the NUMA working set scanning fault case.
It uses the page lock to serialize. No migration pte dance is
necessary because the pte is already unmapped when we decide
to migrate.
Signed-off-by: Peter Zijlstra
Cc: Johannes Weiner
Cc: Mel Gorman
Cc: Andrea Arcangeli
Cc
On Mon, 2013-04-15 at 14:02 +0800, Yan, Zheng wrote:
> From: "Yan, Zheng"
>
> If perf event buffer is in overwrite mode, the kernel only updates
> the data head when it overwrites old samples. The program that owns
> the buffer needs to periodically check the buffer and update a variable
> that track
On Mon, 2013-04-15 at 11:33 +0200, Ingo Molnar wrote:
> * Paul Gortmaker wrote:
>
> > Recent activity has had a focus on moving functionally related blocks of
> > stuff
> > out of sched/core.c into stand-alone files. The code relating to load
> > average
> > calculations has grown significan
On Mon, 2013-04-15 at 03:42 -0700, tip-bot for Tommi Rantala wrote:
> Commit-ID: 8176cced706b5e5d15887584150764894e94e02f
> Gitweb: http://git.kernel.org/tip/8176cced706b5e5d15887584150764894e94e02f
> Author: Tommi Rantala
> AuthorDate: Sat, 13 Apr 2013 22:49:14 +0300
> Committer: Ingo M
On Mon, 2013-04-15 at 12:21 -0500, Jacob Shin wrote:
> Add support for AMD Family 15h [and above] northbridge performance
> counters. MSRs 0xc0010240 ~ 0xc0010247 are shared across all cores
> that share a common northbridge.
>
> Add support for AMD Family 16h L2 performance counters. MSRs
> 0xc00
On Mon, 2013-04-15 at 16:30 -0700, Andrew Morton wrote:
> I think this will break the build if CONFIG_PERF_EVENTS=n and
> CONFIG_LOCKUP_DETECTOR=y. I was able to create such a config for
> powerpc. If I'm reading it correctly, CONFIG_PERF_EVENTS cannot be
> disabled on x86_64? If so, what the he
On Tue, 2013-04-16 at 06:57 +, Pan, Zhenjie wrote:
> Watchdog uses the cpu clock cycle performance monitor event to generate an NMI
> to detect hard lockup.
> But when the cpu's frequency changes, the event period will also change.
> It's no longer what the configuration expects.
> For example, set the NMI event
On Fri, 2013-04-19 at 10:25 +0200, Ingo Molnar wrote:
> It might eventually make sense to integrate the 'average load'
> calculation as well
> with all this - as they really have a similar purpose, the avenload[]
> vector of
> averages is conceptually similar to the rq->cpu_load[] vector of
> ave
nly runs a periodic RT task,
> is close to LOAD_AVG_MAX whatever the running duration of the RT task is.
>
> A new idle_exit function is called when the prev task is the idle function
> so the elapsed time will be accounted as idle time in the rq's load.
Acked-by: Peter Zijlstra
On Tue, 2013-04-16 at 19:51 +0800, Yan, Zheng wrote:
> From: "Yan, Zheng"
>
> I sent these 3 patches to the mailing list some time ago, but got no response.
> Patch 1 and patch 2 are bug fixes, patch 3 adds Ivy Bridge-EP support.
Acked-by: Peter Zijlstra
r overflow interrupts. Sampling mode and
> per-thread events are not supported.
>
> Signed-off-by: Jacob Shin
Acked-by: Peter Zijlstra
Zheng
Date: Mon Sep 10 15:53:49 2012 +0800
perf/x86: Add cpumask for uncore pmu
This patch adds a cpumask file to the uncore pmu sysfs directory. The
cpumask file contains one active cpu for every socket.
Signed-off-by: "Yan, Zheng"
Acked-by: Peter Zijlst
On Sun, 2013-04-21 at 10:52 +0200, Ingo Molnar wrote:
> * George Dunlap wrote:
>
> > Any comments? it's been 2 weeks now.
>
> Looks good to me - Peter, any objections?
Nope, looks good.
McKenney
> Tested-by: Gustavo Luiz Duarte
Acked-by: Peter Zijlstra
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index b0cd865..8db9551 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4593,6 +4593,7 @@ void perf_event_comm(stru
On Fri, 2013-04-19 at 15:10 +0200, Vincent Guittot wrote:
> As suggested by Frederic Weisbecker, another solution is to have the
> same
> rcu lifecycle for both NOHZ_IDLE and sched_domain struct. I have
> introduce
> a new sched_domain_rq struct that is the entry point for both
> sched_domains
> an
OK,.. Ingo said that pipe-test was the original motivation for
wake_affine() and since that's currently broken to pieces due to
select_idle_sibling() is there still a benefit to having it at all?
Can anybody find any significant regression when simply killing
wake_affine()?
On Mon, 2013-04-22 at 13:01 +0200, Vincent Guittot wrote:
> > I'm not quite getting things.. what's wrong with adding this flags
> > thing to sched_domain itself? That's already RCU destroyed so why
> add a
> > second RCU layer?
>
> We need one flags for all sched_domain so if we add it into
> sch
On Sun, 2013-04-21 at 17:12 -0400, Rik van Riel wrote:
>
> If we always incremented the ticket number by 2 (instead of 1), then
> we could use the lower bit of the ticket number as the spinlock.
ISTR that paravirt ticket locks already do that and use the lsb to
indicate the unlock needs to perfor
On Tue, 2013-03-26 at 15:01 +0900, Joonsoo Kim wrote:
> @@ -5506,10 +5506,10 @@ static void rebalance_domains(int cpu, enum
> cpu_idle_type idle)
> if (time_after_eq(jiffies, sd->last_balance +
> interval)) {
> if (load_balance(cpu, rq, sd, idle, &balance))
>
't
> consider other cpus. Assigning to 'this_rq->idle_stamp' is now valid.
>
> Cc: Srivatsa Vaddagiri
> Acked-by: Peter Zijlstra
> Signed-off-by: Joonsoo Kim
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 9d693d0..3f8c4f2 10064
On Tue, 2013-03-26 at 15:01 +0900, Joonsoo Kim wrote:
> Commit 88b8dac0 makes load_balance() consider other cpus in its group.
> But, in that, there is no code for preventing to re-select dst-cpu.
> So, same dst-cpu can be selected over and over.
>
> This patch add functionality to load_balance()
ith them taken care of.
I'll leave that up to you and Ingo.
Otherwise:
Acked-by: Peter Zijlstra
Thanks!
On Mon, 2013-04-22 at 08:52 -0400, Rik van Riel wrote:
> On 04/22/2013 07:51 AM, Peter Zijlstra wrote:
> > On Sun, 2013-04-21 at 17:12 -0400, Rik van Riel wrote:
> >>
> >> If we always incremented the ticket number by 2 (instead of 1), then
> >> we could use t