Re: Softirq priority inversion from "softirq: reduce latencies"

2016-03-07 Thread Mike Galbraith
On Mon, 2016-03-07 at 16:31 +0100, Sebastian Andrzej Siewior wrote: > On 02/29/2016 05:58 AM, Mike Galbraith wrote: > > WRT -rt: if dma tasklets really do have hard (ish) constraints, -rt > > recently "broke" in the same way.. of all softirqs which are deferred > > to kthread context, due to a rece

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-03-07 Thread Sebastian Andrzej Siewior
On 02/27/2016 07:19 PM, Peter Hurley wrote: > Hi Eric, Hi Peter, > Because both the uart driver (omap8250) and the dmaengine driver > (edma) were (relatively) new, we assumed there was some race between > starting a new rx DMA and processing the previous one. Now after digesting the whole thread

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-03-07 Thread Sebastian Andrzej Siewior
On 02/29/2016 05:58 AM, Mike Galbraith wrote: > WRT -rt: if dma tasklets really do have hard (ish) constraints, -rt > recently "broke" in the same way.. of all softirqs which are deferred > to kthread context, due to a recent change, only timer/hrtimer are > executed at realtime priority by default

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 11:14 AM, Thomas Gleixner wrote: > On Mon, 29 Feb 2016, Peter Hurley wrote: >> On 02/29/2016 10:24 AM, Eric Dumazet wrote: Just to be clear if (time_before(jiffies, end) && !need_resched() && --max_restart) goto res

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread David Miller
From: Thomas Gleixner Date: Mon, 29 Feb 2016 20:14:36 +0100 (CET) > On Mon, 29 Feb 2016, Peter Hurley wrote: >> Or flipping your argument on its head, why not just _always_ execute >> softirq in ksoftirqd? > > Which is what that change effectively does. And that makes a lot of sense, > because y

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 11:13 -0800, Peter Hurley wrote: > On 02/29/2016 07:27 AM, Eric Dumazet wrote: > > On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > > > >> The reason why Eric's change is so effective for Eric's workload is > >> that it fixes the problem where NET_RX keeps getting n

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Thomas Gleixner
On Mon, 29 Feb 2016, Peter Hurley wrote: > On 02/29/2016 10:24 AM, Eric Dumazet wrote: > >> Just to be clear > >> > >> if (time_before(jiffies, end) && !need_resched() && > >> --max_restart) > >> goto restart; > >> > >> aborts softirq *even if 0ns have e

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:27 AM, Eric Dumazet wrote: > On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > >> The reason why Eric's change is so effective for Eric's workload is >> that it fixes the problem where NET_RX keeps getting new network packets >> so it keeps looping, servicing more NET_RX s

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 10:24 AM, Eric Dumazet wrote: > On lun., 2016-02-29 at 10:05 -0800, Peter Hurley wrote: > >> While I appreciate the attempt, that's not the problem. >> >> Just to be clear >> >> if (time_before(jiffies, end) && !need_resched() && >> --max_restart) >>

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 10:05 -0800, Peter Hurley wrote: > While I appreciate the attempt, that's not the problem. > > Just to be clear > > if (time_before(jiffies, end) && !need_resched() && > --max_restart) > goto restart; > > aborts softir

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 08:21 AM, Eric Dumazet wrote: > On lun., 2016-02-29 at 07:54 -0800, Peter Hurley wrote: > >> The current kernel is HZ=250 but this would occur on HZ=1000 as well. > > Right. But the problem with HZ=100 and HZ=250 is that the detection can > happen because jiffy granularity is too

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread David Miller
From: Peter Hurley Date: Mon, 29 Feb 2016 07:03:11 -0800 > However, I'm pointing out that Eric's sledgehammer approach to fixing > the NET_RX softirq bug is having significant side-effects in other > subsystems. Either your hardware can handle arbitrary latencies and thus can use softirqs for ev

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 07:58 -0800, Peter Hurley wrote: > All that's happened is the first loop of NET_RX softirq has woken a > process; that is sufficient to abort softirq and defer it for ksoftirqd. > > That's why I'm saying this is a priority inversion, and one that > will happen a lot. Sure.
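The abort described above follows from the restart condition quoted elsewhere in this thread, `time_before(jiffies, end) && !need_resched() && --max_restart`. Below is a minimal userspace sketch of that condition, not the kernel code itself: `struct loop_state` and `should_restart()` are hypothetical names introduced for illustration, and `time_before()` is approximated with a wrap-safe signed comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the values the restart condition reads. */
struct loop_state {
    unsigned long now;   /* stands in for jiffies */
    unsigned long end;   /* deadline, in jiffies */
    bool need_resched;   /* stands in for need_resched() */
    int max_restart;     /* remaining restart budget */
};

/*
 * Returns true if the softirq loop would jump back to "restart".
 * Because && short-circuits, max_restart is only decremented when the
 * deadline has not passed and no reschedule is pending.
 */
static bool should_restart(struct loop_state *s)
{
    assert(s);
    return (long)(s->now - s->end) < 0 &&   /* time_before(now, end) */
           !s->need_resched &&
           --s->max_restart;
}
```

In this sketch, a state with `need_resched` set returns false even when `now` is still before `end`, which matches the point made above: waking a process in the first pass is enough to stop the loop, and any still-pending softirq is left for ksoftirqd.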

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 07:54 -0800, Peter Hurley wrote: > The current kernel is HZ=250 but this would occur on HZ=1000 as well. Right. But the problem with HZ=100 and HZ=250 is that the detection can happen because jiffy granularity is too coarse, since msecs_to_jiffies(2) -> 1 Following pat
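The `msecs_to_jiffies(2) -> 1` remark above can be checked with a small userspace sketch. It assumes the round-up conversion the kernel uses when HZ divides 1000 evenly; `msecs_to_jiffies_sketch()` is a hypothetical stand-in, not the kernel function.

```c
#include <assert.h>

#define MSEC_PER_SEC 1000UL

/*
 * Userspace approximation of a round-up milliseconds-to-jiffies
 * conversion. Only valid for hz values that divide 1000 evenly
 * (for example 100, 250, 1000); that restriction is an assumption
 * of this sketch.
 */
static unsigned long msecs_to_jiffies_sketch(unsigned long ms, unsigned long hz)
{
    assert(hz != 0 && hz <= MSEC_PER_SEC && MSEC_PER_SEC % hz == 0);

    unsigned long ms_per_jiffy = MSEC_PER_SEC / hz;

    /* Round up so that a nonzero duration never becomes zero jiffies. */
    return (ms + ms_per_jiffy - 1) / ms_per_jiffy;
}
```

Under these assumptions a 2 ms budget comes out as 1 jiffy at both HZ=100 and HZ=250, and as 2 jiffies at HZ=1000. A deadline of one jiffy ahead expires at the next tick, which can arrive well before 2 ms have elapsed; this is the coarse-granularity problem described above.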

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:19 AM, Eric Dumazet wrote: > On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > >> Not the case. The softirq is raised from interrupt. >> >> Before Eric's change, when an interrupt raises a new softirq >> while processing another softirq, the new softirq is immediately >> p

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/29/2016 07:40 AM, Mike Galbraith wrote: > On Mon, 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > >>> If I'm listening properly, the root cause is that there is a timing >>> constraint involved, which is being exposed because one softirq raises >>> another (ew). >> >> Not the case. The soft

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Mike Galbraith
On Mon, 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > > If I'm listening properly, the root cause is that there is a timing > > constraint involved, which is being exposed because one softirq raises > > another (ew). > > Not the case. The softirq is raised from interrupt. Yeah, saw that on re

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > The reason why Eric's change is so effective for Eric's workload is > that it fixes the problem where NET_RX keeps getting new network packets > so it keeps looping, servicing more NET_RX softirq. You have very little idea of what is happe

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Eric Dumazet
On lun., 2016-02-29 at 07:03 -0800, Peter Hurley wrote: > Not the case. The softirq is raised from interrupt. > > Before Eric's change, when an interrupt raises a new softirq > while processing another softirq, the new softirq is immediately > processed *after the existing softirq completes*. >

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-29 Thread Peter Hurley
On 02/28/2016 08:58 PM, Mike Galbraith wrote: > On Sun, 2016-02-28 at 18:01 +0100, Francois Romieu wrote: >> Mike Galbraith : >> [...] >>> Hrm, relatively new + tasklet woes rings a bell. Ah, that.. >>> >>> >>> What's worse is that at the point where this code was written it was >>> already well

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-28 Thread Mike Galbraith
On Sun, 2016-02-28 at 18:01 +0100, Francois Romieu wrote: > Mike Galbraith : > [...] > > Hrm, relatively new + tasklet woes rings a bell. Ah, that.. > > > > > > What's worse is that at the point where this code was written it was > > already well known that tasklets are a steaming pile of crap

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-28 Thread Francois Romieu
Mike Galbraith : [...] > Hrm, relatively new + tasklet woes rings a bell. Ah, that.. > > > What's worse is that at the point where this code was written it was > already well known that tasklets are a steaming pile of crap and > should die. > > > Source thereof https://lwn.net/Articles/588457

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Mike Galbraith
On Sat, 2016-02-27 at 10:19 -0800, Peter Hurley wrote: > Hi Eric, > > For a while now, we've been struggling to understand why we've been > observing missed uart rx DMA. > > Because both the uart driver (omap8250) and the dmaengine driver > (edma) were (relatively) new, we assumed there was some

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread David Miller
From: Peter Hurley Date: Sat, 27 Feb 2016 18:10:27 -0800 > That tasklet should run before any process. You never have this guarantee, even before Eric's patch. Under load tasklets run from ksoftirqd just like any other softirq. Please fix your driver and stop blaming Eric's change. Thank you.

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Eric Dumazet
On sam., 2016-02-27 at 18:10 -0800, Peter Hurley wrote: > On 02/27/2016 05:59 PM, Eric Dumazet wrote: > > On sam., 2016-02-27 at 15:33 -0800, Peter Hurley wrote: > >> On 02/27/2016 03:04 PM, David Miller wrote: > >>> From: Peter Hurley > >>> Date: Sat, 27 Feb 2016 12:29:39 -0800 > >>> > Not r

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 05:59 PM, Eric Dumazet wrote: > On sam., 2016-02-27 at 15:33 -0800, Peter Hurley wrote: >> On 02/27/2016 03:04 PM, David Miller wrote: >>> From: Peter Hurley >>> Date: Sat, 27 Feb 2016 12:29:39 -0800 >>> Not really. softirq raised from interrupt context will always execute

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Eric Dumazet
On sam., 2016-02-27 at 15:33 -0800, Peter Hurley wrote: > On 02/27/2016 03:04 PM, David Miller wrote: > > From: Peter Hurley > > Date: Sat, 27 Feb 2016 12:29:39 -0800 > > > >> Not really. softirq raised from interrupt context will always execute > >> on this cpu and not in ksoftirqd, unless load

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 03:04 PM, David Miller wrote: > From: Peter Hurley > Date: Sat, 27 Feb 2016 12:29:39 -0800 > >> Not really. softirq raised from interrupt context will always execute >> on this cpu and not in ksoftirqd, unless load forces softirq loop abort. > > That guarantee never was specified.

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread David Miller
From: Peter Hurley Date: Sat, 27 Feb 2016 12:29:39 -0800 > Not really. softirq raised from interrupt context will always execute > on this cpu and not in ksoftirqd, unless load forces softirq loop abort. That guarantee never was specified. Or are you saying that by design, on a system under loa

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
On 02/27/2016 12:13 PM, Eric Dumazet wrote: > On sam., 2016-02-27 at 10:19 -0800, Peter Hurley wrote: >> Hi Eric, >> >> For a while now, we've been struggling to understand why we've been >> observing missed uart rx DMA. >> >> Because both the uart driver (omap8250) and the dmaengine driver >> (edm

Re: Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Eric Dumazet
On sam., 2016-02-27 at 10:19 -0800, Peter Hurley wrote: > Hi Eric, > > For a while now, we've been struggling to understand why we've been > observing missed uart rx DMA. > > Because both the uart driver (omap8250) and the dmaengine driver > (edma) were (relatively) new, we assumed there was some

Softirq priority inversion from "softirq: reduce latencies"

2016-02-27 Thread Peter Hurley
Hi Eric, For a while now, we've been struggling to understand why we've been observing missed uart rx DMA. Because both the uart driver (omap8250) and the dmaengine driver (edma) were (relatively) new, we assumed there was some race between starting a new rx DMA and processing the previous one.