Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Gautham R Shenoy
Signed-off-by: Srikar Dronamraju
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
Link v2:
https://lore.kernel.org/linuxppc-dev/20200428093836.27190-1-sri...@linux.vnet.ibm.com/t/#u
mm/page_alloc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: "Kirill A. Shutemov"
Cc: Christopher Lameter
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Gautham R Shenoy
Signed-off-by: Srikar Dronamraju
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
arch/powerpc/mm/numa.c | 16 +++
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: "Kirill A. Shutemov"
Cc: Christopher Lameter
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Gautham R Shenoy
Signed-off-by: Srikar Dronamraju
---
Changelog v2:->v3:
- Resolved
/sys/devices/system/node/possible: 0-31
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: "Kirill A. Shutemov"
Cc: Christopher Lameter
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: linux-kernel@vger.kernel.org
Cc: Michal Hocko
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: "Kirill A. Shutemov"
Cc: Christopher Lameter
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Srikar Dronamraju
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
arch/powerpc/m
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: "Kirill A. Shutemov"
Cc: Christopher Lameter
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Srikar Dronamraju
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
arch/powerpc/mm/numa.c | 16 ++--
1 file changed,
Cc: Michael Ellerman
Cc: Andrew Morton
Cc: Linus Torvalds
Signed-off-by: Srikar Dronamraju
---
Changelog v1:->v2:
- Rebased to v5.7-rc3
- Updated the changelog.
mm/page_alloc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 69827d4fa
zero node.
3. NUMA Multi node but with CPUs and memory from node 0.
4. NUMA Multi node but with no CPUs and memory from node 0.
--
Thanks and Regards
Srikar Dronamraju
vior.
>
> Fix this by only using FOLL_SPLIT_PMD for the uprobe register case.
>
> Add a WARN() to confirm uprobe unregister never works on huge pages, and
> abort the operation when this WARN() triggers.
>
> Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
In task_numa_fault():

	if (!priv && !local && ng && ng->active_nodes > 1 &&
	    numa_is_active_node(cpu_node, ng) &&
	    numa_is_active_node(mem_node, ng))
		local = 1;

Hence all accesses will be accounted as local, and scanning would slow down.
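For reference, here is a self-contained sketch of that accounting decision; the struct and helpers below are simplified stand-ins for the kernel's numa_group handling, not the real implementation:

#include <stdbool.h>

struct numa_group_view {
	int active_nodes;
	unsigned long active_mask;	/* one bit per node considered "active" */
};

static bool numa_is_active_node(int nid, const struct numa_group_view *ng)
{
	return ng->active_mask & (1UL << nid);
}

/* Returns true when the fault ends up being accounted as "local". */
static bool fault_counted_as_local(bool priv, bool local, int cpu_node,
				   int mem_node, const struct numa_group_view *ng)
{
	if (!priv && !local && ng && ng->active_nodes > 1 &&
	    numa_is_active_node(cpu_node, ng) &&
	    numa_is_active_node(mem_node, ng))
		return true;	/* shared fault between two active nodes */

	return local;
}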
--
Thanks and Regards
Srikar Dronamraju
se above?
> if (update_cpu_associativity_changes_mask() > 0)
> - topology_schedule_update();
> + sdo = true;
> reset_topology_timer();
> }
> + if (sdo)
> + topology_schedule_update();
> + topology_scans++;
> }
Are the above two hunks necessary? I am not able to see how the current
changes differ from the previous version.
--
Thanks and Regards
Srikar Dronamraju
> Suggested-by: Kees Cook
> Reviewed-by: David Windsor
> Reviewed-by: Hans Liljestrand
> Signed-off-by: Elena Reshetova
> ---
> kernel/events/uprobes.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
Looks good to me.
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
* Vincent Guittot [2019-07-26 16:42:53]:
> On Fri, 26 Jul 2019 at 15:59, Srikar Dronamraju
> wrote:
> > > @@ -7361,19 +7357,46 @@ static int detach_tasks(struct lb_env *env)
> > > if (!can_migrate_task(p, env))
> > > goto next;
e not on the local NUMA node. Speed up
> * NUMA scanning to get the memory moved over.
> */
> - int ratio = max(lr_ratio, ps_ratio);
> + int ratio = max(lr_ratio, sp_ratio);
> diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot;
> }
>
> --
> 2.20.1
>
--
Thanks and Regards
Srikar Dronamraju
task, so it could be a good one to be picked for load balancing.
No?
> /*
> * Attempt to move tasks. If find_busiest_group has found
> * an imbalance but busiest->nr_running <= 1, the group is
> --
> 2.7.4
>
--
Thanks and Regards
Srikar Dronamraju
be
>
> if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
> slow down scanning
> else if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
> if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD)
> speed up scanning
> else
> slow down scanning
> } else
> speed up scanning
>
> This follows your idea better?
>
> Best Regards,
> Huang, Ying
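Rendering that ordering as plain C makes the branches easier to compare; the ratio inputs are hypothetical and only the SLOTS/THRESHOLD constants follow the kernel's defaults:

#define NUMA_PERIOD_SLOTS	10
#define NUMA_PERIOD_THRESHOLD	7

enum scan_action { SPEED_UP_SCANNING, SLOW_DOWN_SCANNING };

/* Decision order proposed above, expressed as a small pure function. */
static enum scan_action pick_scan_action(int lr_ratio, int sp_ratio)
{
	if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
		return SLOW_DOWN_SCANNING;

	if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
		if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD)
			return SPEED_UP_SCANNING;
		return SLOW_DOWN_SCANNING;
	}

	return SPEED_UP_SCANNING;
}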
--
Thanks and Regards
Srikar Dronamraju
;
> }
>
> /*
> - * if *imbalance is less than the average load per runnable task
> - * there is no guarantee that any tasks will be moved so we'll have
> - * a think about bumping its value to force at least one task to be
> - * moved
> + * Both group are or will become overloaded and we're trying to get
> + * all the CPUs to the average_load, so we don't want to push
> + * ourselves above the average load, nor do we wish to reduce the
> + * max loaded CPU below the average load. At the same time, we also
> + * don't want to reduce the group load below the group capacity.
> + * Thus we look for the minimum possible imbalance.
>*/
> - if (env->imbalance < busiest->load_per_task)
> - return fix_small_imbalance(env, sds);
> + env->src_grp_type = migrate_load;
> + env->imbalance = min(
> + (busiest->avg_load - sds->avg_load) * busiest->group_capacity,
> + (sds->avg_load - local->avg_load) * local->group_capacity
> + ) / SCHED_CAPACITY_SCALE;
> }
We calculate avg_load only for the !group_overloaded case, but seem to be
using it for group_overloaded cases too.
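To make the min() above concrete, here is a tiny worked example with made-up load and capacity numbers (purely illustrative; only SCHED_CAPACITY_SCALE matches the kernel's value):

#include <stdio.h>

#define SCHED_CAPACITY_SCALE 1024UL

int main(void)
{
	/* Hypothetical average loads: busiest group above the domain
	 * average, local group below it. */
	unsigned long busiest_avg_load = 900, local_avg_load = 500;
	unsigned long sds_avg_load = 700;
	unsigned long busiest_capacity = 2048, local_capacity = 1024;

	/* How much we may pull without dragging busiest below the average. */
	unsigned long pull = (busiest_avg_load - sds_avg_load) * busiest_capacity;
	/* How much we may add without pushing local above the average. */
	unsigned long room = (sds_avg_load - local_avg_load) * local_capacity;

	unsigned long imbalance = (pull < room ? pull : room) / SCHED_CAPACITY_SCALE;

	/* pull = 200 * 2048, room = 200 * 1024 -> min is room -> imbalance = 200 */
	printf("imbalance = %lu\n", imbalance);
	return 0;
}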
--
Thanks and Regards
Srikar Dronamraju
erov
> Signed-off-by: Song Liu
Looks good to me.
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
| 6 ++
> 1 file changed, 2 insertions(+), 4 deletions(-)
Looks good to me.
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
> This patch allows uprobe to use the original page when possible (all uprobes
> on the page are already removed).
>
> Acked-by: Kirill A. Shutemov
> Signed-off-by: Song Liu
>
Looks good to me.
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
+ }
> +#endif
> + return nr_cpu_ids;
> +}
> +
Should we have a static function for sched_numa_find_closest instead of
having #ifdef in the function?
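Something along these lines is what the question above has in mind (a sketch only; the config symbol and the function signature are assumed from context, and the NUMA body is elided):

#ifdef CONFIG_NUMA
int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
{
	/* ... walk the NUMA distance levels as in the patch above ... */
	return nr_cpu_ids;
}
#else
static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
{
	return nr_cpu_ids;
}
#endif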
> static int __sdt_alloc(const struct cpumask *cpu_map)
> {
> struct sched_domain_topology_level *tl;
--
Thanks and Regards
Srikar Dronamraju
ONFIG_NUMA */
> +
> static int __sdt_alloc(const struct cpumask *cpu_map)
> {
> struct sched_domain_topology_level *tl;
>
Looks good to me.
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
Sometimes the assignment happens for idle_cpu and sometimes the
assignment is for a non-idle cpu.
> if (!--nr)
> return -1;
> if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> --
> 2.9.3
>
--
Thanks and Regards
Srikar Dronamraju
l, which may
be beneficial.
> atomic_inc(&sg->ref);
> return sg;
> }
> --
> 2.22.0
>
--
Thanks and Regards
Srikar Dronamraju
* Peter Zijlstra [2019-07-08 12:23:12]:
> On Sat, Jul 06, 2019 at 10:52:23PM +0530, Srikar Dronamraju wrote:
> > * Markus Elfring [2019-07-06 16:05:17]:
> >
> > > From: Markus Elfring
> > > Date: Sat, 6 Jul 2019 16:00:13 +0200
> > >
> > &
th and without patch) the generated code differs.
> --mtx
>
> --
> Enrico Weigelt, metux IT consult
> Free software and Linux embedded engineering
> i...@metux.net -- +49-151-27565287
>
--
Thanks and Regards
Srikar Dronamraju
: Srikar Dronamraju
---
Documentation/trace/uprobetracer.rst | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/Documentation/trace/uprobetracer.rst
b/Documentation/trace/uprobetracer.rst
index 4c3bfde2..4346e23 100644
--- a/Documentation/trace/uprobetracer.rst
+++ b/Documentation/trace/uprobetracer.rst
* Mel Gorman [2018-09-10 10:41:47]:
> On Fri, Sep 07, 2018 at 01:37:39PM +0100, Mel Gorman wrote:
> > > Srikar's patch here:
> > >
> > >
> > > http://lkml.kernel.org/r/1533276841-16341-4-git-send-email-sri...@linux.vnet.ibm.com
> > >
> > > Also frobs this condition, but in a less radical way
> > Running SPECJbb2005. Higher bops are better.
> >
> > Kernel A = 4.18+ 13 sched patches part of v4.19-rc1.
> > Kernel B = Kernel A + 6 patches
> > (http://lore.kernel.org/lkml/1533276841-16341-1-git-send-email-sri...@linux.vnet.ibm.com)
> > Kernel C = Kernel B - (Avoid task migration for small
* Peter Zijlstra [2018-09-12 11:36:21]:
> On Wed, Sep 12, 2018 at 12:24:10PM +0530, Srikar Dronamraju wrote:
>
> > Kernel A = 4.18+ 13 sched patches part of v4.19-rc1.
> > Kernel B = Kernel A + 6 patches
> > (http://lore.kernel.org/lkml/1533276841-16
> >
> > /*
> > + * Maximum numa importance can be 1998 (2*999);
> > + * SMALLIMP @ 30 would be close to 1998/64.
> > + * Used to deter task migration.
> > + */
> > +#define SMALLIMP 30
> > +
> > +/*
> >
> > /*
> > +* If the numa importance is less than SMALLIMP,
> > +* task migra
> > +#ifdef CONFIG_NUMA_BALANCING
> > + if (!p->mm || (p->flags & PF_EXITING))
> > + return;
> > +
> > + if (p->numa_faults) {
> > + int src_nid = cpu_to_node(task_cpu(p));
> > + int dst_nid = cpu_to_node(new_cpu);
> > +
> > + if (src_nid != dst_nid)
> >
core will complete more work than SMT 4 core
threads.
--
Thanks and Regards
Srikar Dronamraju
* Vincent Guittot [2018-09-05 11:11:35]:
> On Wed, 5 Sep 2018 at 10:50, Srikar Dronamraju
> wrote:
> >
> > * Vincent Guittot [2018-09-05 09:36:42]:
> >
> > > >
> > > > I dont know of any systems that have come with single threaded and
> >
Power8 (4 node, 16 node), 2 node Power9, 2 node Skylake,
and 4 node Power7. I will surely keep you informed and am eager to know the
results of your experiments.
--
Thanks and Regards
Srikar Dronamraju
re seeing Thread B; it skips and loses
an opportunity to swap.
Eventually thread B will get an opportunity to move to node 0 when it
calls task_numa_placement, but we are probably stopping it from getting
there earlier.
--
Thanks and Regards
Srikar Dronamraju
t_imp)
+ if (maymove && moveimp >= env->best_imp)
goto assign;
else
In Mel's fix, if we have already found a candidate task to swap and then encounter an
idle cpu, we go ahead and overwrite the swap candidate. There is
always a chance that the swap candidate is a better fit than moving to the idle cpu.
In the patch which is in your queue, we move only if it is better than the
swap candidate. So this is in no way less radical than Mel's patch and is probably
more correct.
--
Thanks and Regards
Srikar Dronamraju
>
* Peter Zijlstra [2018-09-07 15:19:23]:
> On Fri, Sep 07, 2018 at 06:26:49PM +0530, Srikar Dronamraju wrote:
>
> > Can you please pick
> >
> >
> > 1. 69bb3230297e881c797bbc4b3dbf73514078bc9d sched/numa: Stop multiple tasks
> > from moving to the CPU at the same time
0.00%)   31293.06 (  3.91%)
> MB/sec add     32825.12 (  0.00%)   34883.62 (  6.27%)
> MB/sec triad   32549.52 (  0.00%)   34906.60 (  7.24%
>
> Signed-off-by: Mel Gorman
> ---
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
tch_pid(p, last_cpupid)))
	return true;
meaning we have run at least MIN number of scans, and the task is most likely
the one using this page.
--
Thanks and Regards
Srikar Dronamraju
	return false;
to:
	if (!cpupid_pid_unset(last_cpupid) &&
	    cpupid_to_nid(last_cpupid) == dst_nid)
		return true;
i.e., if the group's tasks have likely consolidated to a node, or the task was
moved to a different node but the accesses were private, just move the memory.
The drawback though is that we keep pulling memory every time the task moves
across nodes (which is probably restricted for long running tasks to some
extent by your fix).
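For clarity, the suggested check could be read as a tiny helper (sketch only, not a tested patch; last_cpupid and dst_nid are the locals already computed in should_numa_migrate_memory(), and cpupid_pid_unset()/cpupid_to_nid() are the existing kernel helpers):

/* True when the last recorded accessor has already settled on dst_nid. */
static bool last_accessor_on_dst_node(int last_cpupid, int dst_nid)
{
	/*
	 * Either the group has consolidated on dst_nid, or the task moved
	 * there and its accesses are private: pull the memory along.
	 * Drawback: memory keeps following the task on every cross-node move.
	 */
	return !cpupid_pid_unset(last_cpupid) &&
	       cpupid_to_nid(last_cpupid) == dst_nid;
}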
--
Thanks and Regards
Srikar Dronamraju
* Srikar Dronamraju [2018-10-02 23:00:05]:
> I will try to get a DayTrader run in a day or two. There JVM and db threads
> act on the same memory, I presume it might show some insights.
I did 2 runs of DayTrader 7 with and without the patch on a 2 node Power9
PowerNV box.
https://github.com/
ere tasks are pulled frequently cross-node (e.g. worker thread
> model or a pipelined computation).
>
> I'm only looking to address the case where the load balancer spreads a
> workload early and the memory should move to the new node quickly. If it
> turns out there are cases where that decision is wrong, it gets remedied
> quickly but if your proposal is ever wrong, the system doesn't recover.
>
Agree.
--
Thanks and Regards
Srikar Dronamraju
* Mel Gorman [2018-10-03 14:21:55]:
> On Wed, Oct 03, 2018 at 06:37:41PM +0530, Srikar Dronamraju wrote:
> > * Srikar Dronamraju [2018-10-02 23:00:05]:
> >
>
> That's unfortunate.
>
> How much does this workload normally vary between runs? If you monitor
>
Mel Gorman (1):
sched/numa: Limit the conditions where scan period is reset
Srikar Dronamraju (5):
sched/numa: Stop multiple tasks from moving to the CPU at the same
time
sched/numa: Pass destination CPU as a parameter to migrate_task_rq
sched/numa: Reset scan rate whenever task moves across nodes
numa_interleave 0 0
numa_local 35661 41383
numa_other 0 1
numa_pages_migrated 568 815
numa_pte_updates 6518 11323
Acked-by: Mel Gorman
Reviewed-by: Rik van Riel
Signed-off-by: Srikar Dronamraju
---
Changelog
Add comments as requested by Peter
1 0
numa_pages_migrated 815 706
numa_pte_updates 11323 10176
Signed-off-by: Srikar Dronamraju
---
Changelog v1->v2:
Rename cpu to CPU
Rename numa to NUMA
kernel/sched/core.c | 2 +-
kernel/sched/deadline.c | 2 +-
kernel/sched/fair.c | 2 +-
kernel/sc
0
numa_local 36338 35526
numa_other 0 0
numa_pages_migrated 706 539
numa_pte_updates 10176 8433
Signed-off-by: Srikar Dronamraju
---
Changelog v1->v2:
Rename cpu to CPU
Rename numa to NUMA
kernel/sched/fair.c |
numa_hit 36230 36470
numa_huge_pte_updates 0 0
numa_interleave 0 0
numa_local 36228 36465
numa_other 2 5
numa_pages_migrated 703 726
numa_pte_updates 14742 11930
Signed-off-by: Srikar Dronamraju
---
Changelog
-by: Srikar Dronamraju
Suggested-by: Peter Zijlstra
---
Changelog:
Fix the stretch-every-interval issue pointed out by Peter Zijlstra.
mm/migrate.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index d6a2e89..4f1d894 100644
--- a/mm/migrate.c
numa_pages_migrated 539 616
numa_pte_updates 8433 13374
Signed-off-by: Mel Gorman
Signed-off-by: Srikar Dronamraju
---
Changelog v1->v2:
Rename numa to NUMA
kernel/sched/fair.c | 25 +++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --gi
> With :
> commit 2d4056fafa19 ("sched/numa: Remove numa_has_capacity()")
>
> the local variables smt, cpus and capacity and their results are not used
> anymore in numa_has_capacity()
>
> Remove this unused code
>
> Cc: Peter Zijlstra
> Cc: Ingo Molnar
* Vincent Guittot [2018-08-29 15:19:10]:
> nr_running in struct numa_stats is not used anywhere in the code.
>
> Remove it.
>
> Cc: Peter Zijlstra
> Cc: Ingo Molnar
> Cc: Srikar Dronamraju
> Cc: Rik van Riel
> Cc: linux-kernel@vger.kernel.org (open list)
>
is way capacity might actually
be more than capacity_orig. I was always under the impression that
capacity_orig > capacity. Or am I misunderstanding that?
--
Thanks and Regards
Srikar Dronamraju
* Vincent Guittot [2018-09-04 11:36:26]:
> Hi Srikar,
>
> Le Tuesday 04 Sep 2018 à 01:24:24 (-0700), Srikar Dronamraju a écrit :
> > However after this change, capacity_orig of each SMT thread would be
> > 1024. For example SMT 8 core capacity_orig would now be 8192.
* Mel Gorman [2018-05-09 09:41:48]:
> On Mon, May 07, 2018 at 04:06:07AM -0700, Srikar Dronamraju wrote:
> > > @@ -1876,7 +1877,18 @@ static void numa_migrate_preferred(struct
> > > task_struct *p)
> > >
> > > /* Periodically retry migrating the task to
not illogical and was worth attempting to fix. However,
> the approach was wrong. Given that we're at rc4 and a fix is not obvious,
> it's better to play safe, revert this commit and retry later.
>
> Signed-off-by: Mel Gorman
Reviewed-by: Srikar Dronamraju
Hi Mel,
I do see performance improving with commit 7347fc87df ("sched/numa:
Delay retrying placement for automatic NUMA balance after wake_affine()")
even on powerpc, where we have SD_WAKE_AFFINE *disabled* on NUMA sched
domains. Ideally this commit should not have affected powerpc machines.
Tha
these replaced pages happen to be physically
contiguous so that THP kicks in to replace all of these pages with one
THP page. Can this happen in practice?
Are there any other cases that I have missed?
--
Thanks and Regards
Srikar Dronamraju
backed text and
> very little iTLB pressure :-)
>
> That said, we haven't run into the uprobes issue yet.
>
Thanks Johannes, Kirill, Rik.
Reviewed-by: Srikar Dronamraju
ert __replace_page() to use page_check_walk()" ?
Otherwise looks good to me.
Reviewed-by: Srikar Dronamraju
number of tasks than the local group.
Signed-off-by: Srikar Dronamraju
---
Here are the relevant perf stat numbers of a 22 core, SMT 8 Power8 machine.
Without patch:
Performance counter stats for 'ebizzy -t 22 -S 100' (5 runs):
1,440 context-switches #0
> Added more people to the CC list.
>
> Em Wed, Mar 15, 2017 at 05:58:19PM -0700, Alexei Starovoitov escreveu:
> > On Thu, Feb 16, 2017 at 05:00:50PM +1100, Anton Blanchard wrote:
> > > We have uses of CONFIG_UPROBE_EVENT and CONFIG_KPROBE_EVENT as
> > > well as CONFIG_UPROBE_EVENTS and CONFIG_KPR
11.38% _find_next_bit.part.0
3.06% find_next_bit
- 11.03% 0.00% [kernel] [k] zone_balanced
- zone_balanced
- zone_watermark_ok_safe
145.84% zone_watermark_ok_safe
11.31% _find_next_bit.part.0
3.04% find_next_bit
--
Thanks and Regards
Srikar Dronamraju
31220 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 31969280 pages, LIFO batch:1
--
Thanks and Regards
Srikar Dronamraju
probe_events
> > -bash: echo: write error: No such file or directory
> >
> > Signed-off-by: Dmitry Safonov
>
> Acked-by: Steven Rostedt
Agree.
Acked-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
    N        Min        Max     Median        Avg      Stddev
x   5   37102220   42736809   38442478   39529626   2298389.4
Signed-off-by: Srikar Dronamraju
---
kernel/sched/fair.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel
.
Resetting nr_balance_failed after a successful active balance ensures
that a hot task is not unreasonably migrated. This can be verified by
looking at the number of hot task migrations reported by /proc/schedstat.
Signed-off-by: Srikar Dronamraju
---
kernel/sched/fair.c | 12 ++--
1 file
Having the numa group id in /proc/sched_debug helps to see how the numa
groups have spread across the system.
Signed-off-by: Srikar Dronamraju
---
kernel/sched/debug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 315c68e
o have absolute numbers since
differential migrations between two accesses can be easily calculated.
Signed-off-by: Srikar Dronamraju
---
kernel/sched/debug.c | 38 +-
kernel/sched/fair.c | 22 +-
kernel/sched/sched.h | 10 +-
3
e=1344 group_shared=0
numa_faults node=1 task_private=641 task_shared=65 group_private=641 group_shared=65
numa_faults node=2 task_private=512 task_shared=0 group_private=512 group_shared=0
numa_faults node=3 task_private=64 task_shared=1 group_private=64 group_shared=1
Srikar Dronamraju (3):
Currently print_cfs_rq() is declared in include/linux/sched.h.
However it is not used outside kernel/sched. Hence move the declaration to
kernel/sched/sched.h.
Also, some functions are only available for CONFIG_SCHED_DEBUG. Hence move
the declarations within #ifdef.
Signed-off-by: Srikar Dronamraju
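A minimal sketch of what the relocated declaration would look like in kernel/sched/sched.h (the signature is assumed to match the existing one in include/linux/sched.h):

#ifdef CONFIG_SCHED_DEBUG
extern void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq);
#endif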
Variable sched_numa_balancing is available for both CONFIG_SCHED_DEBUG
and !CONFIG_SCHED_DEBUG. All code paths now check for
sched_numa_balancing. Hence remove sched_feat(NUMA).
Suggested-by: Ingo Molnar
Signed-off-by: Srikar Dronamraju
---
kernel/sched/core.c | 6 --
kernel/sched
Simple rename of numabalancing_enabled variable to sched_numa_balancing.
No functional changes.
Suggested-by: Ingo Molnar
Signed-off-by: Srikar Dronamraju
---
kernel/sched/core.c | 6 +++---
kernel/sched/fair.c | 4 ++--
kernel/sched/sched.h | 6 +++---
3 files changed, 8 insertions(+), 8
This commit
- Makes sched_numa_balancing common to CONFIG_SCHED_DEBUG and
!CONFIG_SCHED_DEBUG. Earlier it was only in !CONFIG_SCHED_DEBUG
- Checks for sched_numa_balancing instead of sched_feat(NUMA)
Signed-off-by: Srikar Dronamraju
---
kernel/sched/core.c | 14 +-
kernel/sched/fa
ing.
This patchset
- Renames numabalancing_enabled to sched_numa_balancing
- Makes sched_numa_balancing common to CONFIG_SCHED_DEBUG and
!CONFIG_SCHED_DEBUG. Earlier it was only in !CONFIG_SCHED_DEBUG
- Checks for sched_numa_balancing instead of sched_feat(NUMA)
- Removes NUMA sched feature
Srikar Dro
Variable sched_numa_balancing toggles the numa_balancing feature. Hence
move from a simple read-mostly variable to a more apt static_branch.
Suggested-by: Peter Zijlstra
Signed-off-by: Srikar Dronamraju
---
kernel/sched/core.c | 10 +++---
kernel/sched/fair.c | 6 +++---
kernel/sched
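For context, the read-mostly-flag to static_branch conversion referred to above follows the standard jump-label pattern, roughly like this (a generic sketch, not the actual patch; numa_balancing_active() is a hypothetical wrapper for illustration):

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_FALSE(sched_numa_balancing);

void set_numabalancing_state(bool enabled)
{
	if (enabled)
		static_branch_enable(&sched_numa_balancing);
	else
		static_branch_disable(&sched_numa_balancing);
}

/* Hot-path checks become a patched jump rather than a memory load: */
static inline bool numa_balancing_active(void)
{
	return static_branch_likely(&sched_numa_balancing);
}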
* tip-bot for Srikar Dronamraju [2015-07-06 08:50:28]:
> Commit-ID: 8a9e62a238a3033158e0084d8df42ea116d69ce1
> Gitweb: http://git.kernel.org/tip/8a9e62a238a3033158e0084d8df42ea116d69ce1
> Author: Srikar Dronamraju
> AuthorDate: Tue, 16 Jun 2015 17:25:59 +0530
> Committe
This commit
- Renames numabalancing_enabled to sched_numa_balancing
- Makes sched_numa_balancing common to CONFIG_SCHED_DEBUG and
!CONFIG_SCHED_DEBUG. Earlier it was only in !CONFIG_SCHED_DEBUG
- Checks for sched_numa_balancing instead of sched_feat(NUMA)
Signed-off-by: Srikar Dronamraju
---
ke
truct return_instance" for the architectures which
> want to override this hook. We can also cleanup prepare_uretprobe() if
> we pass the new return_instance to arch_uretprobe_hijack_return_addr().
>
> Signed-off-by: Oleg Nesterov
Looks good to me.
Acked-by: Srikar Dronamraju
. We will try to improve this logic later.
>
> Signed-off-by: Oleg Nesterov
Looks good to me.
Acked-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
.
>
> Note: this change has no effect on !x86, the arch-agnostic version of
> arch_uretprobe_is_alive() just returns "true".
>
> TODO: as documented by the previous change, arch_uretprobe_is_alive()
> can be fooled by sigaltstack/etc.
>
> Signed-off-by: Oleg Ne
alive() can be false positive,
> the stack can grow after longjmp(). Unfortunately, the kernel can't
> 100% solve this problem, but see the next patch.
>
> Signed-off-by: Oleg Nesterov
Looks good to me.
Acked-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
e == T" positives.
>
> Signed-off-by: Oleg Nesterov
Looks good to me.
Acked-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
lers.
>
> Signed-off-by: Oleg Nesterov
Looks good to me.
Acked-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
JUMP_LABEL */
#ifdef CONFIG_NUMA_BALANCING
-#define sched_feat_numa(x) sched_feat(x)
-#ifdef CONFIG_SCHED_DEBUG
-#define numabalancing_enabled sched_feat_numa(NUMA)
-#else
extern bool numabalancing_enabled;
-#endif /* CONFIG_SCHED_DEBUG */
#else
-#define sched_feat_numa(x) (0)
#define numabalancin
tries to enable
kernel sleep profiling via profile_setup(), the kernel may not be able to do
the right profiling since enqueue_sleeper() may not get called. Should
we alert the user that kernel sleep profiling is disabled?
--
Thanks and Regards
Srikar Dronamraju
e1160e5..51369697466e 100644
> --- a/kernel/profile.c
> +++ b/kernel/profile.c
> @@ -59,6 +59,7 @@ int profile_setup(char *str)
>
> if (!strncmp(str, sleepstr, strlen(sleepstr))) {
> #ifdef CONFIG_SCHEDSTATS
> + force_schedstat_enabled();
> prof_on = SLEEP_PROFILING;
> if (str[strlen(sleepstr)] == ',')
> str += strlen(sleepstr) + 1;
--
Thanks and Regards
Srikar Dronamraju
it a pretty unique gup caller. Being an instruction access
> and also really originating from the kernel (vs. the app), I opted
> to consider this a 'foreign' access where protection keys will not
> be enforced.
>
Changes for uprobes.c look good to me.
Acked-by: Srikar Dronamraju
processors.
>
>
> Signed-off-by: Mel Gorman
> Reviewed-by: Matt Fleming
Reviewed-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
IC 0x03 -> Node 0
> > [0.009731] ACPI: SRAT: Node 0 PXM 1 [mem 0x-0x0009]
> > [0.009732] ACPI: SRAT: Node 0 PXM 1 [mem 0x0010-0xbfff]
> > [0.009733] ACPI: SRAT: Node 0 PXM 1 [mem 0x10000-0x13fff]
>
> This begs a question whether ppc can do the same thing?
Certainly ppc can be made to adapt to this situation, but that would be a
workaround. Do we have a reason why we think node 0 is unique and special?
If yes, can we document it so that in the future people also know why we
consider node 0 to be special? I do understand the *fear of the unknown*, but
when we are unable to theoretically or practically come up with a case, it may
be better to hit the situation to understand what that unknown is.
> I would swear that we've had x86 system with node 0 but I cannot really
> find it and it is possible that it was not x86 after all...
--
Thanks and Regards
Srikar Dronamraju
* Michal Hocko [2020-07-02 10:41:23]:
> On Thu 02-07-20 12:14:08, Srikar Dronamraju wrote:
> > * Michal Hocko [2020-07-01 14:21:10]:
> >
> > > > >>>>> The autonuma problem sounds interesting but again this patch
> > > > >>>>>
node was already used by the renumbered one
> though. It would likely conflate the two I am afraid. But I am not sure
> this is really possible with x86 and a lack of a bug report would
> suggest that nobody is doing that at least.
>
JFYI,
Satheesh, copied in this mail chain, had opened a bug a year ago on a crash
with vCPU hotplug on a memoryless node.
https://bugzilla.kernel.org/show_bug.cgi?id=202187
--
Thanks and Regards
Srikar Dronamraju
* Christopher Lameter [2020-06-29 14:58:40]:
> On Wed, 24 Jun 2020, Srikar Dronamraju wrote:
>
> > Currently Linux kernel with CONFIG_NUMA on a system with multiple
> > possible nodes, marks node 0 as online at boot. However in practice,
> > there are systems which h
oves the tertiary condition added as part of that
> commit and added a check for NULL and -EAGAIN.
>
> Fixes: 2ed6edd33a21 ("perf: Add cond_resched() to task_function_call()")
> Signed-off-by: Kajol Jain
> Reported-by: Srikar Dronamraju
Tested-by: Srikar Dronamraju
--
Thanks and Regards
Srikar Dronamraju
* Qian Cai [2020-10-07 09:05:42]:
Hi Qian,
Thanks for testing and reporting the failure.
> On Mon, 2020-09-21 at 15:26 +0530, Srikar Dronamraju wrote:
> > All threads of a SMT4 core can either be part of this CPU's l2-cache
> > mask or not related to this CPU l2-cache mas
All the arch specific topology cpumasks are within a node/DIE.
However when setting these per-CPU cpumasks, the system traverses through
all the online CPUs. This is redundant.
Reduce the traversal to only the CPUs that are online in the node to which
the CPU belongs.
Signed-off-by: Srikar Dronamraju
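A minimal sketch of the traversal reduction described above, using a hypothetical helper name; cpumask_of_node(), cpu_online_mask and for_each_cpu_and() are the existing kernel primitives:

#include <linux/cpumask.h>
#include <linux/topology.h>

/* Hypothetical helper: build a topology mask for 'cpu' by walking only the
 * online CPUs of its own node instead of every online CPU in the system. */
static void build_mask_from_node(int cpu, struct cpumask *mask)
{
	int node = cpu_to_node(cpu);
	int i;

	cpumask_clear(mask);
	for_each_cpu_and(i, cpu_online_mask, cpumask_of_node(node))
		cpumask_set_cpu(i, mask);
}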
On Power, cpu_core_mask and cpu_cpu_mask refer to the same set of CPUs.
cpu_cpu_mask is needed by the scheduler, hence look at deprecating
cpu_core_mask. Before deleting cpu_core_mask, ensure its only user
is moved to cpu_cpu_mask.
Signed-off-by: Srikar Dronamraju
Tested-by: Satheesh Rajendran
Cc: Nicholas Piggin
Cc: Anton Blanchard
Cc: Oliver O'Halloran
Cc: Nathan Lynch
Cc: Michael Neuling
Cc: Gautham R Shenoy
Cc: Satheesh Rajendran
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Qian Cai
Srikar Dronamraju (11):
powerpc/topology: Update topology_core_cpumask
p
Now that cpu_core_mask has been removed and topology_core_cpumask has
been updated to use cpu_cpu_mask, we no longer need
get_physical_package_id().
Signed-off-by: Srikar Dronamraju
Tested-by: Satheesh Rajendran
Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Anton Blanchard