[tip: sched/core] sched/fair: Eliminate bandwidth race between throttling and distribution

2020-05-01 Thread tip-bot2 for Paul Turner
The following commit has been merged into the sched/core branch of tip: Commit-ID: e98fa02c4f2ea4991dae422ac7e34d102d2f0599 Gitweb: https://git.kernel.org/tip/e98fa02c4f2ea4991dae422ac7e34d102d2f0599 Author: Paul Turner AuthorDate: Fri, 10 Apr 2020 15:52:07 -07:00 Committer

Re: [RFC] x86: Speculative execution warnings

2019-05-14 Thread Paul Turner
From: Nadav Amit Date: Fri, May 10, 2019 at 7:45 PM To: Cc: Borislav Petkov, Nadav Amit, Andy Lutomirski, Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Jann Horn > It may be useful to check at runtime whether certain assertions are > violated even during speculative execution. This can allow t

Re: [PATCH] x86/retpoline: Avoid return buffer underflows on context switch

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 4:48 PM, David Woodhouse wrote: > On Tue, 2018-01-09 at 00:44 +, Woodhouse, David wrote: >> On IRC, Arjan assures me that 'pause' here really is sufficient as a >> speculation trap. If we do end up returning back here as a >> misprediction, that 'pause' will stop the spe

Re: [PATCH v6 11/10] x86/retpoline: Avoid return buffer underflows on context switch II

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 5:21 PM, Andi Kleen wrote: > On Mon, Jan 08, 2018 at 05:16:02PM -0800, Andi Kleen wrote: >> > If we clear the registers, what the hell are you going to put in the >> > RSB that helps you? >> >> RSB allows you to control chains of gadgets. > > I admit the gadget thing is a bi

Re: [PATCH] x86/retpoline: Avoid return buffer underflows on context switch

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 2:25 PM, Andi Kleen wrote: >> So pjt did alignment, a single unroll and per discussion earlier today >> (CET) or late last night (PST), he only does 16. > > I used the Intel recommended sequence, which recommends 32. > > Not sure if alignment makes a difference. I can check.

Re: [PATCH] x86/retpoline: Avoid return buffer underflows on context switch

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 2:11 PM, Peter Zijlstra wrote: > On Mon, Jan 08, 2018 at 12:15:31PM -0800, Andi Kleen wrote: >> diff --git a/arch/x86/include/asm/nospec-branch.h >> b/arch/x86/include/asm/nospec-branch.h >> index b8c8eeacb4be..e84e231248c2 100644 >> --- a/arch/x86/include/asm/nospec-branch

Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

2018-01-08 Thread Paul Turner
ing a binary, or a new AMD processor, 32 calls are required. I would suggest tuning this based on the current CPU (which also covers the future case while saving cycles now) to save overhead. On Mon, Jan 8, 2018 at 3:16 AM, Andrew Cooper wrote: > On 08/01/18 10:42, Paul Turner wrote: >>

Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 2:45 AM, David Woodhouse wrote: > On Mon, 2018-01-08 at 02:34 -0800, Paul Turner wrote: >> One detail that is missing is that we still need RSB refill in some >> cases. >> This is not because the retpoline sequence itself will underflow (it >> is

Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

2018-01-08 Thread Paul Turner
On Mon, Jan 8, 2018 at 2:38 AM, Jiri Kosina wrote: > On Mon, 8 Jan 2018, Paul Turner wrote: > >> user->kernel in the absence of SMEP: >> In the absence of SMEP, we must worry about user-generated RSB entries >> being consumable by kernel execution. >> Generally sp

Re: [PATCH v6 00/10] Retpoline: Avoid speculative indirect calls in kernel

2018-01-08 Thread Paul Turner
tem it took ~43 cycles on average. Note that non-zero displacement calls should be used as these may be optimized to not interact with the RSB due to their use in fetching RIP for 32-bit relocations. On Mon, Jan 8, 2018 at 2:34 AM, Paul Turner wrote: > One detail that is missing is that we sti

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 5, 2018 at 3:26 AM, Paolo Bonzini wrote: > On 05/01/2018 11:28, Paul Turner wrote: >> >> The "pause; jmp" sequence proved minutely faster than "lfence;jmp" which is >> why >> it was chosen. >> >> "pause; jmp" 3

Re: [PATCH 0/7] IBRS patch series

2018-01-05 Thread Paul Turner
On Fri, Jan 5, 2018 at 3:32 AM, Paolo Bonzini wrote: > On 04/01/2018 22:22, Van De Ven, Arjan wrote: >> this is about a level of paranoia you are comfortable with. >> >> Retpoline on Skylake raises the bar for the issue enormously, but >> there are a set of corner cases that exist and that are not

Re: [PATCH 0/7] IBRS patch series

2018-01-05 Thread Paul Turner
On Thu, Jan 4, 2018 at 11:33 AM, Linus Torvalds wrote: > On Thu, Jan 4, 2018 at 11:19 AM, David Woodhouse wrote: >> >> On Skylake the target for a 'ret' instruction may also come from the >> BTB. So if you ever let the RSB (which remembers where the 'call's came >> from) get empty, you end up vuln

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 05, 2018 at 10:55:38AM +, David Woodhouse wrote: > On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote: > > On Thu, Jan 04, 2018 at 07:27:58PM +, David Woodhouse wrote: > > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > >

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Fri, Jan 05, 2018 at 10:55:38AM +, David Woodhouse wrote: > On Fri, 2018-01-05 at 02:28 -0800, Paul Turner wrote: > > On Thu, Jan 04, 2018 at 07:27:58PM +, David Woodhouse wrote: > > > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > >

Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 08:18:57AM -0800, Andy Lutomirski wrote: > On Thu, Jan 4, 2018 at 1:30 AM, Woodhouse, David wrote: > > On Thu, 2018-01-04 at 01:10 -0800, Paul Turner wrote: > >> Apologies for the discombobulation around today's disclosure. Obviously > >&g

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 10:25:35AM -0800, Linus Torvalds wrote: > On Thu, Jan 4, 2018 at 10:17 AM, Alexei Starovoitov > wrote: > > > > Clearly Paul's approach to retpoline without lfence is faster. Using pause rather than lfence does not represent a fundamental difference here. A protected indir

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 10:40:23AM -0800, Andi Kleen wrote: > > Clearly Paul's approach to retpoline without lfence is faster. > > I'm guessing it wasn't shared with amazon/intel until now and > > this set of patches going to adopt it, right? > > > > Paul, could you share a link to a set of altern

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-05 Thread Paul Turner
On Thu, Jan 04, 2018 at 07:27:58PM +, David Woodhouse wrote: > On Thu, 2018-01-04 at 10:36 -0800, Alexei Starovoitov wrote: > > > > Pretty much. > > Paul's writeup: https://support.google.com/faqs/answer/7625886 > > tldr: jmp *%r11 gets converted to: > > call set_up_target; > > capture_spec: >

Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-04 Thread Paul Turner
On Thu, Jan 4, 2018 at 1:10 AM, Paul Turner wrote: > Apologies for the discombobulation around today's disclosure. Obviously the > original goal was to communicate this a little more coherently, but the > unscheduled advances in the disclosure disrupted the efforts to pull this &

Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-04 Thread Paul Turner
On Thu, Jan 4, 2018 at 1:10 AM, Paul Turner wrote: > Apologies for the discombobulation around today's disclosure. Obviously the > original goal was to communicate this a little more coherently, but the > unscheduled advances in the disclosure disrupted the efforts to pull this &

[RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-04 Thread Paul Turner
Apologies for the discombobulation around today's disclosure. Obviously the original goal was to communicate this a little more coherently, but the unscheduled advances in the disclosure disrupted the efforts to pull this together more cleanly. I wanted to open discussion on the "retpoline" approach

Re: Avoid speculative indirect calls in kernel

2018-01-03 Thread Paul Turner
On Wed, Jan 3, 2018 at 3:51 PM, Linus Torvalds wrote: > On Wed, Jan 3, 2018 at 3:09 PM, Andi Kleen wrote: >> This is a fix for Variant 2 in >> https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html >> >> Any speculative indirect calls in the kernel can be tricked

Re: [RFC PATCH for 4.15 00/24] Restartable sequences and CPU op vector v11

2017-11-14 Thread Paul Turner
I have some comments that apply to many of the threads. I've been fully occupied with a wedding and a security issue; but I'm about to be free to spend the majority of my time on RSEQ things. I was sorely hoping that day would be today. But it's looking like I'm still a day or two from being free

Re: [RESEND PATCH 2/2] sched/fair: Optimize __update_sched_avg()

2017-03-31 Thread Paul Turner
On Thu, Mar 30, 2017 at 7:14 AM, Peter Zijlstra wrote: > On Thu, Mar 30, 2017 at 02:16:58PM +0200, Peter Zijlstra wrote: >> On Thu, Mar 30, 2017 at 04:21:08AM -0700, Paul Turner wrote: > >> > > + >> > > + if (unlikely(periods >= LOAD_AVG_MAX_N)) >

Re: [RESEND PATCH 2/2] sched/fair: Optimize __update_sched_avg()

2017-03-31 Thread Paul Turner
On Fri, Mar 31, 2017 at 12:01 AM, Peter Zijlstra wrote: > On Thu, Mar 30, 2017 at 03:02:47PM -0700, Paul Turner wrote: >> On Thu, Mar 30, 2017 at 7:14 AM, Peter Zijlstra wrote: >> > On Thu, Mar 30, 2017 at 02:16:58PM +0200, Peter Zijlstra wrote: >> >> On Thu, Ma

Re: [RESEND PATCH 2/2] sched/fair: Optimize __update_sched_avg()

2017-03-30 Thread Paul Turner
On Thu, Mar 30, 2017 at 7:14 AM, Peter Zijlstra wrote: > On Thu, Mar 30, 2017 at 02:16:58PM +0200, Peter Zijlstra wrote: >> On Thu, Mar 30, 2017 at 04:21:08AM -0700, Paul Turner wrote: > >> > > + >> > > + if (unlikely(periods >= LOAD_AVG_MAX_N)) >

Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

2017-03-30 Thread Paul Turner
On Mon, Mar 20, 2017 at 11:08 AM, Patrick Bellasi wrote: > On 20-Mar 13:15, Tejun Heo wrote: >> Hello, >> >> On Tue, Feb 28, 2017 at 02:38:38PM +, Patrick Bellasi wrote: >> > This patch extends the CPU controller by adding a couple of new >> > attributes, capacity_min and capacity_max, which c

Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

2017-03-30 Thread Paul Turner
There is one important, fundamental difference here: {cfs,rt}_{period,runtime}_us is a property that applies to a group of threads, it can be sub-divided. We can consume 100ms of quota either by having one thread run for 100ms, or 2 threads running for 50ms. This is not true for capacity. It's a

Re: [RESEND PATCH 2/2] sched/fair: Optimize __update_sched_avg()

2017-03-30 Thread Paul Turner
> --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -2767,7 +2767,7 @@ static const u32 __accumulated_sum_N32[] > * Approximate: > * val * y^n, where y^32 ~= 0.5 (~1 scheduling period) > */ > -static __always_inline u64 decay_load(u64 val, u64 n) > +static u64 decay_load(u64 val

Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode

2017-02-09 Thread Paul Turner
On Thu, Feb 2, 2017 at 12:06 PM, Tejun Heo wrote: > Hello, > > This patchset implements cgroup v2 thread mode. It is largely based > on the discussions that we had at the plumbers last year. Here's the > rough outline. > > * Thread mode is explicitly enabled on a cgroup by writing "enable" > i

Re: [PATCH] sched/fair: fix calc_cfs_shares fixed point arithmetics

2016-12-19 Thread Paul Turner
On Mon, Dec 19, 2016 at 3:29 PM, Samuel Thibault wrote: > Paul Turner, on Mon 19 Dec 2016 15:26:19 -0800, wrote: >> >> > - if (shares < MIN_SHARES) >> >> > - shares = MIN_SHARES; >> > ... >> >> > return

Re: [PATCH] sched/fair: fix calc_cfs_shares fixed point arithmetics

2016-12-19 Thread Paul Turner
On Mon, Dec 19, 2016 at 3:07 PM, Samuel Thibault wrote: > Paul Turner, on Mon 19 Dec 2016 14:44:38 -0800, wrote: >> On Mon, Dec 19, 2016 at 2:40 PM, Samuel Thibault >> wrote: >> > 2159197d6677 ("sched/core: Enable increased load resolution on 64-bit >> > k

Re: [PATCH] sched/fair: fix calc_cfs_shares fixed point arithmetics

2016-12-19 Thread Paul Turner
On Mon, Dec 19, 2016 at 2:40 PM, Samuel Thibault wrote: > 2159197d6677 ("sched/core: Enable increased load resolution on 64-bit > kernels") > > exposed yet another miscalculation in calc_cfs_shares: MIN_SHARES is unscaled, > and must thus be scaled before being manipulated against "shares" amount

Re: [RFC PATCH v8 1/9] Restartable sequences system call

2016-11-26 Thread Paul Turner
; >> The restartable critical sections (percpu atomics) work has been started >> by Paul Turner and Andrew Hunter. It lets the kernel handle restart of >> critical sections. [1] [2] The re-implementation proposed here brings a >> few simplifications to the ABI which facilita

Re: [PATCH] sched/fair: Fix fixed point arithmetic width for shares and effective load

2016-08-23 Thread Paul Turner
On Mon, Aug 22, 2016 at 7:00 AM, Dietmar Eggemann wrote: > > Since commit 2159197d6677 ("sched/core: Enable increased load resolution > on 64-bit kernels") we now have two different fixed point units for > load. > shares in calc_cfs_shares() has 20 bit fixed point unit on 64-bit > kernels. Therefo

Re: [tip:locking/core] sched/wait: Fix signal handling in bit wait helpers

2015-12-11 Thread Paul Turner
eans that _wait_on_bit_lock can return -EINTR up to __lock_page; which does not validate the return code and blindly returns. This looks to have been a previously existing bug, but it was at least masked by the fact that it required a fatal signal previously (and that the page we return unlo

Re: [PATCH 2/4] sched: Document Program-Order guarantees

2015-11-02 Thread Paul Turner
On Mon, Nov 2, 2015 at 12:34 PM, Peter Zijlstra wrote: > On Mon, Nov 02, 2015 at 12:27:05PM -0800, Paul Turner wrote: >> I suspect this part might be more explicitly expressed by specifying >> the requirements that migration satisfies; then providing an example. >> This makes

Re: [PATCH] sched: Update task->on_rq when tasks are moving between runqueues

2015-11-02 Thread Paul Turner
On Wed, Oct 28, 2015 at 6:58 PM, Peter Zijlstra wrote: > On Wed, Oct 28, 2015 at 05:57:10PM -0700, Olav Haugan wrote: >> On 15-10-25 11:09:24, Peter Zijlstra wrote: >> > On Sat, Oct 24, 2015 at 11:01:02AM -0700, Olav Haugan wrote: >> > > Task->on_rq has three states: >> > > 0 - Task is not on ru

Re: [PATCH 2/4] sched: Document Program-Order guarantees

2015-11-02 Thread Paul Turner
On Mon, Nov 2, 2015 at 5:29 AM, Peter Zijlstra wrote: > These are some notes on the scheduler locking and how it provides > program order guarantees on SMP systems. > > Cc: Linus Torvalds > Cc: Will Deacon > Cc: Oleg Nesterov > Cc: Boqun Feng > Cc: "Paul E. McKenney" > Cc: Jonathan Corbet >

Re: [RFC PATCH v2 2/3] restartable sequences: x86 ABI

2015-10-27 Thread Paul Turner
On Tue, Oct 27, 2015 at 10:03 PM, Peter Zijlstra wrote: > > On Tue, Oct 27, 2015 at 04:57:05PM -0700, Paul Turner wrote: > > +static void rseq_sched_out(struct preempt_notifier *pn, > > +struct task_struct *next) > > +{ > > + set

[RFC PATCH v2 2/3] restartable sequences: x86 ABI

2015-10-27 Thread Paul Turner
From: Paul Turner Recall the general ABI is: The kernel ABI generally consists of: a) A shared TLS word which exports the current cpu and event-count b) A shared TLS word which, when non-zero, stores the first post-commit instruction if a sequence is active. (The kernel

[RFC PATCH v2 3/3] restartable sequences: basic self-tests

2015-10-27 Thread Paul Turner
rrupted by signals. "basic_percpu_ops_test" is a slightly more "realistic" variant, implementing a few simple per-cpu operations and testing their correctness. It also includes a trivial example of how user-space may multiplex the critical section via the restart handle

[RFC PATCH 0/3] restartable sequences v2: fast user-space percpu critical sections

2015-10-27 Thread Paul Turner
This is an update to the previously posted series at: https://lkml.org/lkml/2015/6/24/665 Dave Watson has posted a similar follow-up which allows additional critical regions to be registered as well as single-step support at: https://lkml.org/lkml/2015/10/22/588 This series is a new approach

[RFC PATCH v2 1/3] restartable sequences: user-space per-cpu critical sections

2015-10-27 Thread Paul Turner
From: Paul Turner Introduce the notion of a restartable sequence. This is a piece of user code that can be described in 3 components: 1) Establish where [e.g. which cpu] the thread is running 2) Preparatory work that is dependent on the state in [1]. 3) A committing instruction that

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-10-15 Thread Paul Turner
On Thu, Oct 1, 2015 at 11:46 AM, Tejun Heo wrote: > Hello, Paul. > > Sorry about the delay. Things were kinda hectic in the past couple > weeks. Likewise :-( > > On Fri, Sep 18, 2015 at 04:27:07AM -0700, Paul Turner wrote: >> On Sat, Sep 12, 2015 at 7:40 AM, Tejun Heo

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-09-18 Thread Paul Turner
On Sat, Sep 12, 2015 at 7:40 AM, Tejun Heo wrote: > Hello, > > On Wed, Sep 09, 2015 at 05:49:31AM -0700, Paul Turner wrote: >> I do not think this is a layering problem. This is more like C++: >> there is no sane way to concurrently use all the features available, >&g

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-09-09 Thread Paul Turner
24, 2015 at 04:06:39PM -0700, Paul Turner wrote: >> > This is an erratic behavior on cpuset's part tho. Nothing else >> > behaves this way and it's borderline buggy. >> >> It's actually the only sane possible interaction here. >> >> If you do

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 3:49 PM, Tejun Heo wrote: > Hello, > > On Mon, Aug 24, 2015 at 03:03:05PM -0700, Paul Turner wrote: >> > Hmm... I was hoping for an actual configurations and usage scenarios. >> > Preferably something people can set up and play with. >> >

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 3:19 PM, Tejun Heo wrote: > Hey, > > On Mon, Aug 24, 2015 at 02:58:23PM -0700, Paul Turner wrote: >> > Why isn't it? Because the programs themselves might try to override >> > it? >> >> The major reasons are: >> >>

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 2:40 PM, Tejun Heo wrote: > On Mon, Aug 24, 2015 at 02:19:29PM -0700, Paul Turner wrote: >> > Would it be possible for you to give realistic and concrete examples? >> > I'm not trying to play down the use cases but concrete examples are >&

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 2:36 PM, Tejun Heo wrote: > Hello, Paul. > > On Mon, Aug 24, 2015 at 01:52:01PM -0700, Paul Turner wrote: >> We typically share our machines between many jobs, these jobs can have >> cores that are "private" (and not shared with other jobs

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 2:17 PM, Tejun Heo wrote: > Hello, > > On Mon, Aug 24, 2015 at 02:10:17PM -0700, Paul Turner wrote: >> Suppose that we have 10 vcpu threads and 100 support threads. >> Suppose that we want the support threads to receive up to 10% of the >> ti

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 2:12 PM, Tejun Heo wrote: > Hello, Paul. > > On Mon, Aug 24, 2015 at 02:00:54PM -0700, Paul Turner wrote: >> > Hmmm... I'm trying to understand the usecases where having hierarchy >> > inside a process are actually required so that we d

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 2:02 PM, Tejun Heo wrote: > Hello, > > On Mon, Aug 24, 2015 at 01:54:08PM -0700, Paul Turner wrote: >> > That alone doesn't require hierarchical resource distribution tho. >> > Setting nice levels reasonably is likely to alleviate most of

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 1:25 PM, Tejun Heo wrote: > Hello, Austin. > > On Mon, Aug 24, 2015 at 04:00:49PM -0400, Austin S Hemmelgarn wrote: >> >That alone doesn't require hierarchical resource distribution tho. >> >Setting nice levels reasonably is likely to alleviate most of the >> >problem. >> >

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Mon, Aug 24, 2015 at 10:04 AM, Tejun Heo wrote: > Hello, Austin. > > On Mon, Aug 24, 2015 at 11:47:02AM -0400, Austin S Hemmelgarn wrote: >> >Just to learn more, what sort of hypervisor support threads are we >> >talking about? They would have to consume considerable amount of cpu >> >cycles f

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-24 Thread Paul Turner
On Sat, Aug 22, 2015 at 11:29 AM, Tejun Heo wrote: > Hello, Paul. > > On Fri, Aug 21, 2015 at 12:26:30PM -0700, Paul Turner wrote: > ... >> A very concrete example of the above is a virtual machine in which you >> want to guarantee scheduling for the vCPU threads which

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-21 Thread Paul Turner
On Tue, Aug 18, 2015 at 1:31 PM, Tejun Heo wrote: > Hello, Paul. > > On Mon, Aug 17, 2015 at 09:03:30PM -0700, Paul Turner wrote: >> > 2) Control within an address-space. For subsystems with fungible >> > resources, >> > e.g. CPU, it can be useful for a

Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

2015-08-17 Thread Paul Turner
Apologies for the repeat. Gmail ate its plain text setting for some reason. Shame bells. On Mon, Aug 17, 2015 at 9:02 PM, Paul Turner wrote: > > > On Wed, Aug 5, 2015 at 7:31 AM, Tejun Heo wrote: >> >> Hello, >> >> On Wed, Aug 05, 2015 at 11:10:36AM +0200,

Re: [RFC PATCH 2/3] restartable sequences: x86 ABI

2015-06-26 Thread Paul Turner
On Fri, Jun 26, 2015 at 12:31 PM, Andy Lutomirski wrote: > On Fri, Jun 26, 2015 at 11:09 AM, Mathieu Desnoyers > wrote: >> - On Jun 24, 2015, at 6:26 PM, Paul Turner p...@google.com wrote: >> >>> Implements the x86 (i386 & x86-64) ABIs for interrupting and

Re: [RFC PATCH 0/3] restartable sequences: fast user-space percpu critical sections

2015-06-25 Thread Paul Turner
On Thu, Jun 25, 2015 at 6:15 PM, Mathieu Desnoyers wrote: > - On Jun 24, 2015, at 10:54 PM, Paul Turner p...@google.com wrote: > >> On Wed, Jun 24, 2015 at 5:07 PM, Andy Lutomirski wrote: >>> On Wed, Jun 24, 2015 at 3:26 PM, Paul Turner wrote: >>>&

Re: [RFC PATCH 0/3] restartable sequences: fast user-space percpu critical sections

2015-06-24 Thread Paul Turner
On Wed, Jun 24, 2015 at 5:07 PM, Andy Lutomirski wrote: > On Wed, Jun 24, 2015 at 3:26 PM, Paul Turner wrote: >> This is a fairly small series demonstrating a feature we've found to be quite >> powerful in practice, "restartable sequences". >> > > On

[RFC PATCH 1/3] restartable sequences: user-space per-cpu critical sections

2015-06-24 Thread Paul Turner
tested in isolation. Signed-off-by: Paul Turner --- arch/Kconfig |7 + arch/x86/Kconfig |1 arch/x86/syscalls/syscall_64.tbl |1 fs/exec.c |1 include/linux/sched.h | 28 ++ include/uapi/asm-generi

[RFC PATCH 2/3] restartable sequences: x86 ABI

2015-06-24 Thread Paul Turner
t we always want the arguments to be available for sequence restart, it's much more natural to ultimately differentiate the ABI in these two cases. Signed-off-by: Paul Turner --- arch/x86/include/asm/restartable_sequences.h | 50 +++ arch/x86/kernel

[RFC PATCH 3/3] restartable sequences: basic user-space self-tests

2015-06-24 Thread Paul Turner
; is a slightly more "realistic" variant, implementing a few simple per-cpu operations and testing their correctness. It also includes a trivial example of how user-space may multiplex the critical section via the restart handler. Signed-off-by: Paul Turner --- tools/t

[RFC PATCH 0/3] restartable sequences: fast user-space percpu critical sections

2015-06-24 Thread Paul Turner
This is a fairly small series demonstrating a feature we've found to be quite powerful in practice, "restartable sequences". Most simply: these sequences comprise small snippets of user-code that are guaranteed to be (effectively) executed serially, with support for restart (or other handling) in

Re: [RFC PATCH] percpu system call: fast userspace percpu critical sections

2015-05-21 Thread Paul Turner
actually something we've strongly considered dropping. The complexity of correct TLS addressing is non-trivial. > > This approach is inspired by Paul Turner and Andrew Hunter's work > on percpu atomics, which lets the kernel handle restart of critical > sections, ref. &g

Re: [RFC PATCH] percpu system call: fast userspace percpu critical sections

2015-05-21 Thread Paul Turner
On Thu, May 21, 2015 at 12:08 PM, Mathieu Desnoyers wrote: > - Original Message - >> On Thu, May 21, 2015 at 10:44:47AM -0400, Mathieu Desnoyers wrote: >> >> > +struct thread_percpu_user { >> > + int32_t nesting; >> > + int32_t signal_sent; >> > + int32_t signo; >> > + int32_t curr

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-04 Thread Paul Turner
On Thu, Sep 4, 2014 at 2:30 PM, John Stultz wrote: > On Thu, Sep 4, 2014 at 2:17 PM, Andrew Hunter wrote: >> On Wed, Sep 3, 2014 at 5:06 PM, John Stultz wrote: >>> Maybe with the next version of the patch, before you get into the >>> unwinding the math, you might practically describe what is bro

Re: [PATCH] sched: fix timeval conversion to jiffies

2014-09-03 Thread Paul Turner
size_t i = 0; i < 10; ++i) { > struct itimerval prev; > setitimer(ITIMER_PROF, &zero, &prev); > /* on old kernels, this goes up by TICK_USEC every iteration */ > printf("previous value: %ld %ld %ld %ld\n", >prev.it_interval.tv_sec, prev.it_inter

Re: [PATCH v2] sched: Reduce contention in update_cfs_rq_blocked_load

2014-08-26 Thread Paul Turner
On Tue, Aug 26, 2014 at 4:11 PM, Jason Low wrote: > Based on perf profiles, the update_cfs_rq_blocked_load function constantly > shows up as taking up a noticeable % of system run time. This is especially > apparent on larger numa systems. > > Much of the contention is in __update_cfs_rq_tg_load_c

Re: [PATCH] Revert "sched: Fix sleep time double accounting in enqueue entity"

2014-01-22 Thread Paul Turner
On Wed, Jan 22, 2014 at 9:53 AM, wrote: > Vincent Guittot writes: > >> This reverts commit 282cf499f03ec1754b6c8c945c9674b02631fb0f. >> >> With the current implementation, the load average statistics of a sched >> entity >> change according to other activity on the CPU even if this activity is

Re: [PATCH] sched: fix sched_entity avg statistics update

2014-01-21 Thread Paul Turner
On Tue, Jan 21, 2014 at 12:00 PM, Vincent Guittot wrote: > > On 21 Jan 2014 at 19:39, wrote: > >> >> Vincent Guittot writes: >> >> > With the current implementation, the load average statistics of a sched >> > entity >> > change according to other activity on the CPU even if this activity is

Re: [PATCH 0/4] sched: remove cpu_load decay

2013-12-09 Thread Paul Turner
On Mon, Dec 9, 2013 at 5:04 PM, Alex Shi wrote: > On 12/03/2013 06:26 PM, Peter Zijlstra wrote: >> >> Paul, can you guys have a look at this, last time around you have a >> regression with this stuff, so it would be good to hear from you. >> > > Ping Paul. > Ben was looking at this right before t

Re: [PATCH 01/14] sched: add sched_class->task_dead.

2013-11-11 Thread Paul Turner
(*task_dead) (struct task_struct *p); > > void (*switched_from) (struct rq *this_rq, struct task_struct *task); > void (*switched_to) (struct rq *this_rq, struct task_struct *task); Reviewed-by: Paul Turner > -- > 1.7.9.5 > > -- > To unsubscribe from this lis

Re: [PATCH] [sched]: pick the NULL entity caused the panic.

2013-11-11 Thread Paul Turner
On Tue, Nov 12, 2013 at 8:29 AM, Wang, Xiaoming wrote: > cfs_rq get its group run queue but the value of > cfs_rq->nr_running maybe zero, which will cause > the panic in pick_next_task_fair. > So the evaluated of cfs_rq->nr_running is needed. > > [15729.985797] BUG: unable to handle kernel NULL po

[tip:sched/core] sched: Guarantee new group-entities always have weight

2013-10-29 Thread tip-bot for Paul Turner
Commit-ID: 0ac9b1c21874d2490331233b3242085f8151e166 Gitweb: http://git.kernel.org/tip/0ac9b1c21874d2490331233b3242085f8151e166 Author: Paul Turner AuthorDate: Wed, 16 Oct 2013 11:16:27 -0700 Committer: Ingo Molnar CommitDate: Tue, 29 Oct 2013 12:02:23 +0100 sched: Guarantee new group

Re: [PATCH 4/5] sched: Guarantee new group-entities always have weight

2013-10-16 Thread Paul Turner
On Wed, Oct 16, 2013 at 3:01 PM, Peter Zijlstra wrote: > On Wed, Oct 16, 2013 at 11:16:27AM -0700, Ben Segall wrote: >> From: Paul Turner >> >> Currently, group entity load-weights are initialized to zero. This >> admits some races with respect to the first time they a

Re: [tip:sched/core] sched/balancing: Fix cfs_rq-> task_h_load calculation

2013-09-30 Thread Paul Turner
On Mon, Sep 30, 2013 at 7:22 PM, Yuanhan Liu wrote: > On Mon, Sep 30, 2013 at 12:14:03PM +0400, Vladimir Davydov wrote: >> On 09/29/2013 01:47 PM, Yuanhan Liu wrote: >> >On Fri, Sep 20, 2013 at 06:46:59AM -0700, tip-bot for Vladimir Davydov >> >wrote: >> >>Commit-ID: 7e3115ef5149fc502e3a2e80719d

Re: [RFC][PATCH] sched: Avoid select_idle_sibling() for wake_affine(.sync=true)

2013-09-26 Thread Paul Turner
On Thu, Sep 26, 2013 at 4:16 AM, Peter Zijlstra wrote: > On Thu, Sep 26, 2013 at 03:55:55AM -0700, Paul Turner wrote: >> > + /* >> > +* Don't bother with select_idle_sibling() in the case of >> > a sync wakeup >> > +

Re: [RFC][PATCH] sched: Avoid select_idle_sibling() for wake_affine(.sync=true)

2013-09-26 Thread Paul Turner
On Thu, Sep 26, 2013 at 2:58 AM, Peter Zijlstra wrote: > On Wed, Sep 25, 2013 at 10:56:17AM +0200, Mike Galbraith wrote: >> That will make pipe-test go fugly -> pretty, and help very fast/light >> localhost network, but eat heavier localhost overlap recovery. We need >> a working (and cheap) over

Re: [PATCH] sched: Fix task_h_load calculation

2013-09-14 Thread Paul Turner
> if (!se) { > - cfs_rq->h_load = rq->avg.load_avg_contrib; > + cfs_rq->h_load = cfs_rq->runnable_load_avg; Looks good. Reviewed-by: Paul Turner > cfs_rq->last_h_load_update = now; >

Re: [PATCH 07/10] sched, fair: Optimize find_busiest_queue()

2013-08-27 Thread Paul Turner
On Mon, Aug 26, 2013 at 5:07 AM, Peter Zijlstra wrote: > On Sat, Aug 24, 2013 at 03:33:59AM -0700, Paul Turner wrote: >> On Mon, Aug 19, 2013 at 9:01 AM, Peter Zijlstra wrote: >> > +++ b/kernel/sched/fair.c >> > @@ -4977,7 +4977,7 @@ static struct rq *find_busiest_queu

Re: [PATCH 09/10] sched, fair: Fix the sd_parent_degenerate() code

2013-08-27 Thread Paul Turner
On Mon, Aug 26, 2013 at 2:49 PM, Rik van Riel wrote: > On 08/26/2013 08:09 AM, Peter Zijlstra wrote: >> On Sat, Aug 24, 2013 at 03:45:57AM -0700, Paul Turner wrote: >>>> @@ -5157,6 +5158,13 @@ cpu_attach_domain(struct sched_domain *s >>>>

Re: [PATCH 03/10] sched: Clean-up struct sd_lb_stat

2013-08-25 Thread Paul Turner
On Sun, Aug 25, 2013 at 7:56 PM, Lei Wen wrote: > On Tue, Aug 20, 2013 at 12:01 AM, Peter Zijlstra wrote: >> From: Joonsoo Kim >> >> There is no reason to maintain separate variables for this_group >> and busiest_group in sd_lb_stat, except saving some space. >> But this structure is always allo

Re: [PATCH 09/10] sched, fair: Fix the sd_parent_degenerate() code

2013-08-24 Thread Paul Turner
BLING down in case of a > +* degenerate parent; the spans match for this > +* so the property transfers. > +*/ > + if (parent->flags & SD_PREFER_SIBLING) > +

Re: [PATCH 07/10] sched, fair: Optimize find_busiest_queue()

2013-08-24 Thread Paul Turner
load = wl; max_load_power = power; ... This would actually end up being a little more accurate even. [ Alternatively without caching max_load_power we could compare wl * power vs max_load * SCHED_POWER_SCALE. ] Reviewed-by: Paul Turner -- To unsubscribe from this list: send the line "unsubscribe lin

Re: [PATCH 04/10] sched, fair: Shrink sg_lb_stats and play memset games

2013-08-24 Thread Paul Turner
On Mon, Aug 19, 2013 at 9:01 AM, Peter Zijlstra wrote: > We can shrink sg_lb_stats because rq::nr_running is an 'unsigned int' > and cpu numbers are 'int' > > Before: > sgs:/* size: 72, cachelines: 2, members: 10 */ > sds:/* size: 184, cachelines: 3, members: 7 */ > > After: >

Re: [PATCH 03/10] sched: Clean-up struct sd_lb_stat

2013-08-24 Thread Paul Turner
if (env->idle == CPU_NEWLY_IDLE && sds.this_has_capacity && > - !sds.busiest_has_capacity) > + if (env->idle == CPU_NEWLY_IDLE && this->group_has_capacity && > + !busiest->group_has_capacity) >

Re: [PATCH 02/10] sched: Factor out code to should_we_balance()

2013-08-23 Thread Paul Turner
On Thu, Aug 22, 2013 at 3:42 AM, Peter Zijlstra wrote: > On Thu, Aug 22, 2013 at 02:58:27AM -0700, Paul Turner wrote: >> On Mon, Aug 19, 2013 at 9:01 AM, Peter Zijlstra wrote: > >> > + if (local_group) >> > load = target_loa

Re: [PATCH 02/10] sched: Factor out code to should_we_balance()

2013-08-22 Thread Paul Turner
int balance = 1; > + int should_balance = 1; > struct rq *rq = cpu_rq(cpu); > unsigned long interval; > struct sched_domain *sd; > @@ -5618,7 +5615,7 @@ static void rebalance_domains(int cpu, e > } > > if (time_after_eq(jiffies, sd->last_balance + interval)) { > - if (load_balance(cpu, rq, sd, idle, &balance)) { > + if (load_balance(cpu, rq, sd, idle, &should_balance)) > { > /* > * The LBF_SOME_PINNED logic could have > changed > * env->dst_cpu, so we can't know our idle > @@ -5641,7 +5638,7 @@ static void rebalance_domains(int cpu, e > * CPU in our sched group which is doing load balancing more > * actively. > */ > - if (!balance) > + if (!should_balance) > break; > } > rcu_read_unlock(); > > Reviewed-by: Paul Turner

Re: [PATCH 01/10] sched: Remove one division operation in find_busiest_queue()

2013-08-22 Thread Paul Turner
On Mon, Aug 19, 2013 at 9:00 AM, Peter Zijlstra wrote: > From: Joonsoo Kim > > Remove one division operation in find_busiest_queue() by using > crosswise multiplication: > > wl_i / power_i > wl_j / power_j := > wl_i * power_j > wl_j * power_i > > Signed-off-by: Joonsoo Kim > [pet

Re: false nr_running check in load balance?

2013-08-15 Thread Paul Turner
On Thu, Aug 15, 2013 at 10:39 AM, Peter Zijlstra wrote: > On Tue, Aug 13, 2013 at 01:08:17AM -0700, Paul Turner wrote: >> On Tue, Aug 13, 2013 at 12:38 AM, Peter Zijlstra >> wrote: >> > On Tue, Aug 13, 2013 at 12:45:12PM +0800, Lei Wen wrote: >> >> > No

Re: false nr_running check in load balance?

2013-08-13 Thread Paul Turner
On Tue, Aug 13, 2013 at 1:18 AM, Lei Wen wrote: > Hi Paul, > > On Tue, Aug 13, 2013 at 4:08 PM, Paul Turner wrote: >> On Tue, Aug 13, 2013 at 12:38 AM, Peter Zijlstra >> wrote: >>> On Tue, Aug 13, 2013 at 12:45:12PM +0800, Lei Wen wrote: >>>> &

Re: false nr_running check in load balance?

2013-08-13 Thread Paul Turner
On Tue, Aug 13, 2013 at 12:38 AM, Peter Zijlstra wrote: > On Tue, Aug 13, 2013 at 12:45:12PM +0800, Lei Wen wrote: >> > Not quite right; I think you need busiest->cfs.h_nr_running. >> > cfs.nr_running is the number of entries running in this 'group'. If >> > you've got nested groups like: >> > >>

Re: [PATCH] sched,x86: optimize switch_mm for multi-threaded workloads

2013-07-31 Thread Paul Turner
We attached the following explanatory comment to our version of the patch: /* * In the common case (two user threads sharing mm * switching) the bit will be set; avoid doing a write * (via atomic test & set) unless we have to. This is * safe, because no other CPU ever writes to our bit * in the m

Re: [PATCH] sched,x86: optimize switch_mm for multi-threaded workloads

2013-07-31 Thread Paul Turner
cpu, mm_cpumask(next)); > load_cr3(next->pgd); > load_LDT_nolock(&next->context); > } We're carrying the *exact* same patch for *exact* same reason. I've been meaning to send it out but wasn't sure of a goo

Re: PROBLEM: Persistent unfair sharing of a processor by auto groups in 3.11-rc2 (has twice regressed)

2013-07-26 Thread Paul Turner
On Fri, Jul 26, 2013 at 2:50 PM, Peter Zijlstra wrote: > On Fri, Jul 26, 2013 at 02:24:50PM -0700, Paul Turner wrote: >> On Fri, Jul 26, 2013 at 2:03 PM, Peter Zijlstra wrote: >> > >> > >> > OK, so I have the below; however on a second look, Paul, shouldn

Re: PROBLEM: Persistent unfair sharing of a processor by auto groups in 3.11-rc2 (has twice regressed)

2013-07-26 Thread Paul Turner
augmenting > periodic update that was supposed to account for this; resulting in a > potential loss of fairness. > > To fix this, re-introduce the explicit update in > update_cfs_rq_blocked_load() [called via entity_tick()]. > > Cc: sta...@kernel.org > Reviewed-by: Paul Tur
