>Thanks for the clarification.
>
>The problem with what Ming is proposing, in my mind (and it's an existing
>problem today), is that nvme takes precedence over anything
>else until it absolutely cannot hog the cpu in hardirq.
>
>In the thread Ming referenced a case where today if the
Sagi,
Sorry it took a while to bring my system back online.
With the patch, IOPS shows about the same drop as with the 1st patch. I think
the excessive context switches are causing the drop in IOPS.
The following were captured by "perf sched record" for 30 seconds during the
tests.
"perf sched
> >> Long, does this patch make any difference?
> >
> > Sagi,
> >
> > Sorry it took a while to bring my system back online.
> >
> > With the patch, IOPS shows about the same drop as with the 1st patch. I think
> > the excessive context switches are causing the drop in IOPS.
> >
> > The following are ca
Hey Ming,
Ok, so the real problem is per-cpu bounded tasks.
I share Thomas' opinion about a NAPI-like approach.
We already have that, it's irq_poll, but it seems that for this
use-case, we get lower performance for some reason. I'm not entirely
sure why that is, maybe it's because we need to
It seems like we're attempting to stay in irq context for as long as we
can instead of scheduling to softirq/thread context if we have more than
a minimal amount of work to do. Without at least understanding why
softirq/thread degrades us so much, this code seems like the wrong
approach to me. I
On Mon, Sep 09, 2019 at 08:10:07PM -0700, Sagi Grimberg wrote:
> Hey Ming,
>
> > > > Ok, so the real problem is per-cpu bounded tasks.
> > > >
> > > > I share Thomas opinion about a NAPI like approach.
> > >
> > > We already have that, its irq_poll, but it seems that for this
> > > use-case, we
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>Hey Ming,
>
>>>> Ok, so the real problem is per-cpu bounded tasks.
>>>>
>>>> I share Thomas opinion about a NAPI like approach.
>>>
>>> We already have t
Hey Ming,
Ok, so the real problem is per-cpu bounded tasks.
I share Thomas' opinion about a NAPI-like approach.
We already have that, it's irq_poll, but it seems that for this
use-case, we get lower performance for some reason. I'm not
entirely sure why that is, maybe it's because we need to mask interrupts
On Sat, Sep 07, 2019 at 06:19:20AM +0800, Ming Lei wrote:
> On Fri, Sep 06, 2019 at 05:50:49PM +, Long Li wrote:
> > >Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> > >
> > >On Fri, Sep 06, 2019 at 09:48:21AM +0800, Ming Lei wrote:
>
On Fri, Sep 06, 2019 at 11:30:57AM -0700, Sagi Grimberg wrote:
>
> >
> > Ok, so the real problem is per-cpu bounded tasks.
> >
> > I share Thomas opinion about a NAPI like approach.
>
> We already have that, its irq_poll, but it seems that for this
> use-case, we get lower performance for some
On Fri, Sep 06, 2019 at 04:25:55PM -0600, Keith Busch wrote:
> On Sat, Sep 07, 2019 at 06:19:21AM +0800, Ming Lei wrote:
> > On Fri, Sep 06, 2019 at 05:50:49PM +, Long Li wrote:
> > > >Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> > >
On Sat, Sep 07, 2019 at 06:19:21AM +0800, Ming Lei wrote:
> On Fri, Sep 06, 2019 at 05:50:49PM +, Long Li wrote:
> > >Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> > >
> > >Why are all 8 nvmes sharing the same CPU for inte
On Fri, Sep 06, 2019 at 05:50:49PM +, Long Li wrote:
> >Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> >
> >On Fri, Sep 06, 2019 at 09:48:21AM +0800, Ming Lei wrote:
> >> When one IRQ flood happens on one CPU:
> >>
> >>
Ok, so the real problem is per-cpu bounded tasks.
I share Thomas' opinion about a NAPI-like approach.
We already have that, it's irq_poll, but it seems that for this
use-case, we get lower performance for some reason. I'm not
entirely sure why that is, maybe it's because we need to mask interrupts
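For reference, a minimal sketch of how a completion path is usually wired up
with irq_poll: the hardirq handler masks the device interrupt and defers the
work to a softirq-driven poll callback. The irq_poll_* calls are the existing
<linux/irq_poll.h> API; every my_* name and the budget value are placeholders
assumed for illustration, not code from nvme or from the patches in this
thread.

#include <linux/kernel.h>
#include <linux/interrupt.h>
#include <linux/irq_poll.h>

#define MY_POLL_BUDGET	64			/* assumed per-iteration budget */

struct my_queue {
	struct irq_poll iop;
	/* device-specific completion-queue state ... */
};

/* Placeholders standing in for the device-specific bits: */
static int my_process_completions(struct my_queue *q, int budget);
static void my_mask_device_irq(struct my_queue *q);
static void my_unmask_device_irq(struct my_queue *q);

/* Runs in softirq context; handles at most @budget completions per call. */
static int my_irq_poll(struct irq_poll *iop, int budget)
{
	struct my_queue *q = container_of(iop, struct my_queue, iop);
	int done = my_process_completions(q, budget);

	if (done < budget) {
		/* Queue drained: stop polling, let the device interrupt again. */
		irq_poll_complete(iop);
		my_unmask_device_irq(q);
	}
	return done;
}

/* Hardirq handler: do almost nothing here, punt the work to softirq. */
static irqreturn_t my_irq_handler(int irq, void *data)
{
	struct my_queue *q = data;

	my_mask_device_irq(q);
	irq_poll_sched(&q->iop);
	return IRQ_HANDLED;
}

static void my_queue_init(struct my_queue *q)
{
	irq_poll_init(&q->iop, MY_POLL_BUDGET, my_irq_poll);
}

The extra mask/unmask on every interrupt is one plausible source of the
overhead mentioned above compared with completing everything in hardirq.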
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>On Fri, Sep 06, 2019 at 09:48:21AM +0800, Ming Lei wrote:
>> When one IRQ flood happens on one CPU:
>>
>> 1) softirq handling on this CPU can't make progress
>>
>> 2) kernel
On Fri, Sep 06, 2019 at 09:48:21AM +0800, Ming Lei wrote:
> When one IRQ flood happens on one CPU:
>
> 1) softirq handling on this CPU can't make progress
>
> 2) kernel thread bound to this CPU can't make progress
>
> For example, network may require softirq to xmit packets, or another irq
> thr
Hi,
On 06/09/2019 03:48, Ming Lei wrote:
[ ... ]
>> You did not share yet the analysis of the problem (the kernel warnings
>> give the symptoms) and gave the reasoning for the solution. It is hard
>> to understand what you are looking for exactly and how to connect the dots.
>
> Let me explai
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>
>On 06/09/2019 03:22, Long Li wrote:
>[ ... ]
>>
>
>> Tracing shows that the CPU was in either hardirq or softirq all the
>> time before warnings. During tests, the system was un
On 06/09/2019 03:22, Long Li wrote:
[ ... ]
>
> Tracing shows that the CPU was in either hardirq or softirq all the
> time before warnings. During tests, the system was unresponsive at
> times.
>
> Ming's patch fixed this problem. The system was responsive throughout
> tests.
>
> As for perfo
Hi Daniel,
On Thu, Sep 05, 2019 at 12:37:13PM +0200, Daniel Lezcano wrote:
>
> Hi Ming,
>
> On 05/09/2019 11:06, Ming Lei wrote:
> > On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
> >> Hi,
> >>
> >> On 04/09/2019 19:07, Bart Van Assche wrote:
> >>> On 9/3/19 12:50 AM, Daniel Lez
>Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
>
>
>Hi Ming,
>
>On 05/09/2019 11:06, Ming Lei wrote:
>> On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
>>> Hi,
>>>
>>> On 04/09/2019 19:07, Bart Van As
Hi Ming,
On 05/09/2019 11:06, Ming Lei wrote:
> On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
>> Hi,
>>
>> On 04/09/2019 19:07, Bart Van Assche wrote:
>>> On 9/3/19 12:50 AM, Daniel Lezcano wrote:
On 03/09/2019 09:28, Ming Lei wrote:
> On Tue, Sep 03, 2019 at 08:40:35A
On Wed, Sep 04, 2019 at 12:47:13PM -0700, Bart Van Assche wrote:
> On 9/4/19 11:02 AM, Peter Zijlstra wrote:
> > On Wed, Sep 04, 2019 at 10:38:59AM -0700, Bart Van Assche wrote:
> > > I think it is widely known that rdtsc is a relatively slow x86
> > > instruction.
> > > So I expect that using tha
On Wed, Sep 04, 2019 at 07:31:48PM +0200, Daniel Lezcano wrote:
> Hi,
>
> On 04/09/2019 19:07, Bart Van Assche wrote:
> > On 9/3/19 12:50 AM, Daniel Lezcano wrote:
> >> On 03/09/2019 09:28, Ming Lei wrote:
> >>> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
> It is a schedul
On 9/4/19 11:02 AM, Peter Zijlstra wrote:
On Wed, Sep 04, 2019 at 10:38:59AM -0700, Bart Van Assche wrote:
I think it is widely known that rdtsc is a relatively slow x86 instruction.
So I expect that using that instruction will cause a measurable overhead if
it is called frequently enough. I'm n
On Wed, Sep 04, 2019 at 10:38:59AM -0700, Bart Van Assche wrote:
> On 9/4/19 10:31 AM, Daniel Lezcano wrote:
> > On 04/09/2019 19:07, Bart Van Assche wrote:
> > > Only if CONFIG_IRQ_TIME_ACCOUNTING has been enabled. However, I don't
> > > know any Linux distro that enables that option. That's proba
On 9/4/19 10:31 AM, Daniel Lezcano wrote:
On 04/09/2019 19:07, Bart Van Assche wrote:
Only if CONFIG_IRQ_TIME_ACCOUNTING has been enabled. However, I don't
know any Linux distro that enables that option. That's probably because
that option introduces two rdtsc() calls in each interrupt. Given th
Hi,
On 04/09/2019 19:07, Bart Van Assche wrote:
> On 9/3/19 12:50 AM, Daniel Lezcano wrote:
>> On 03/09/2019 09:28, Ming Lei wrote:
>>> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
It is a scheduler problem then?
>>>
>>> Scheduler can do nothing if the CPU is taken complet
On 9/3/19 12:50 AM, Daniel Lezcano wrote:
On 03/09/2019 09:28, Ming Lei wrote:
On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
It is a scheduler problem then?
Scheduler can do nothing if the CPU is taken completely by handling
interrupt & softirq, so it seems not to be a scheduler problem, IMO.
On Tue, Sep 03, 2019 at 09:50:06AM +0200, Daniel Lezcano wrote:
> On 03/09/2019 09:28, Ming Lei wrote:
> > On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
> >> On 03/09/2019 08:31, Ming Lei wrote:
> >>> Hi Daniel,
> >>>
> >>> On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano
On Tue, Sep 03, 2019 at 10:09:57AM +0200, Thomas Gleixner wrote:
> On Tue, 3 Sep 2019, Ming Lei wrote:
> > Scheduler can do nothing if the CPU is taken completely by handling
> > interrupt & softirq, so seems not a scheduler problem, IMO.
>
> Well, but thinking more about it, the solution you are
On Tue, 3 Sep 2019, Ming Lei wrote:
> Scheduler can do nothing if the CPU is taken completely by handling
> interrupt & softirq, so seems not a scheduler problem, IMO.
Well, but thinking more about it, the solution you are proposing is more a
bandaid than anything else.
If you look at the network
On 03/09/2019 09:28, Ming Lei wrote:
> On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
>> On 03/09/2019 08:31, Ming Lei wrote:
>>> Hi Daniel,
>>>
>>> On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
Hi Ming Lei,
On 03/09/2019 05:30, Ming Lei wrote:
On Tue, Sep 03, 2019 at 08:40:35AM +0200, Daniel Lezcano wrote:
> On 03/09/2019 08:31, Ming Lei wrote:
> > Hi Daniel,
> >
> > On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
> >>
> >> Hi Ming Lei,
> >>
> >> On 03/09/2019 05:30, Ming Lei wrote:
> >>
> >> [ ... ]
> >>
> >>
> > 2)
On 03/09/2019 08:31, Ming Lei wrote:
> Hi Daniel,
>
> On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
>>
>> Hi Ming Lei,
>>
>> On 03/09/2019 05:30, Ming Lei wrote:
>>
>> [ ... ]
>>
>>
> 2) irq/timing doesn't cover softirq
That's solvable, right?
>>>
>>> Yeah, we can e
Hi Daniel,
On Tue, Sep 03, 2019 at 07:59:39AM +0200, Daniel Lezcano wrote:
>
> Hi Ming Lei,
>
> On 03/09/2019 05:30, Ming Lei wrote:
>
> [ ... ]
>
>
> >>> 2) irq/timing doesn't cover softirq
> >>
> >> That's solvable, right?
> >
> > Yeah, we can extend irq/timing, but ugly for irq/timing, si
Hi Ming Lei,
On 03/09/2019 05:30, Ming Lei wrote:
[ ... ]
>>> 2) irq/timing doesn't cover softirq
>>
>> That's solvable, right?
>
> Yeah, we can extend irq/timing, but it would be ugly for irq/timing, since irq/timing
> focuses on hardirq prediction, and softirq isn't involved in that
> purpose.
>
>>
On Wed, Aug 28, 2019 at 04:07:19PM +0200, Thomas Gleixner wrote:
> On Wed, 28 Aug 2019, Ming Lei wrote:
> > On Wed, Aug 28, 2019 at 01:23:06PM +0200, Thomas Gleixner wrote:
> > > On Wed, 28 Aug 2019, Ming Lei wrote:
> > > > On Wed, Aug 28, 2019 at 01:09:44AM +0200, Thomas Gleixner wrote:
> > > > >
On Thu, Aug 29, 2019 at 06:15:00AM +, Long Li wrote:
> >>>For some high performance IO devices, interrupt may come very frequently,
> >>>meantime IO request completion may take a bit time. Especially on some
> >>>devices(SCSI or NVMe), IO requests can be submitted concurrently from
> >>>multipl
>>>For some high performance IO devices, interrupt may come very frequently,
>>>meantime IO request completion may take a bit time. Especially on some
>>>devices(SCSI or NVMe), IO requests can be submitted concurrently from
>>>multiple CPU cores, however IO completion is only done on one of these
>
On Wed, 28 Aug 2019, Ming Lei wrote:
> On Wed, Aug 28, 2019 at 01:23:06PM +0200, Thomas Gleixner wrote:
> > On Wed, 28 Aug 2019, Ming Lei wrote:
> > > On Wed, Aug 28, 2019 at 01:09:44AM +0200, Thomas Gleixner wrote:
> > > > > > Also how is that supposed to work when sched_clock is jiffies based?
>
On Wed, Aug 28, 2019 at 01:23:06PM +0200, Thomas Gleixner wrote:
> On Wed, 28 Aug 2019, Ming Lei wrote:
> > On Wed, Aug 28, 2019 at 01:09:44AM +0200, Thomas Gleixner wrote:
> > > > > Also how is that supposed to work when sched_clock is jiffies based?
> > > >
> > > > Good catch, looks ktime_get_ns
On Wed, 28 Aug 2019, Ming Lei wrote:
> On Wed, Aug 28, 2019 at 01:09:44AM +0200, Thomas Gleixner wrote:
> > > > Also how is that supposed to work when sched_clock is jiffies based?
> > >
> > > Good catch, looks like ktime_get_ns() is needed.
> >
> > And what is ktime_get_ns() returning when the only a
On Wed, Aug 28, 2019 at 01:09:44AM +0200, Thomas Gleixner wrote:
> On Wed, 28 Aug 2019, Ming Lei wrote:
> > On Tue, Aug 27, 2019 at 04:42:02PM +0200, Thomas Gleixner wrote:
> > > On Tue, 27 Aug 2019, Ming Lei wrote:
> > > > +
> > > > + int cpu = raw_smp_processor_id();
> > > > + struct
On Wed, 28 Aug 2019, Ming Lei wrote:
> On Tue, Aug 27, 2019 at 06:19:00PM +0200, Thomas Gleixner wrote:
> > > We definitely are not going to have a 64bit multiplication and division on
> > > every interrupt. Aside from that, this breaks 32bit builds all over the
> > > place.
> >
> > That said, we a
On Wed, 28 Aug 2019, Ming Lei wrote:
> On Tue, Aug 27, 2019 at 04:42:02PM +0200, Thomas Gleixner wrote:
> > On Tue, 27 Aug 2019, Ming Lei wrote:
> > > +
> > > + int cpu = raw_smp_processor_id();
> > > + struct irq_interval *inter = per_cpu_ptr(&avg_irq_interval, cpu);
> > > + u64 delta = sched_cloc
On Tue, Aug 27, 2019 at 06:19:00PM +0200, Thomas Gleixner wrote:
> On Tue, 27 Aug 2019, Thomas Gleixner wrote:
> > On Tue, 27 Aug 2019, Ming Lei wrote:
> > > +/*
> > > + * Update average irq interval with the Exponential Weighted Moving
> > > + * Average(EWMA)
> > > + */
> > > +static void irq_upda
On Tue, Aug 27, 2019 at 04:42:02PM +0200, Thomas Gleixner wrote:
> On Tue, 27 Aug 2019, Ming Lei wrote:
> > +/*
> > + * Update average irq interval with the Exponential Weighted Moving
> > + * Average(EWMA)
> > + */
> > +static void irq_update_interval(void)
> > +{
> > +#define IRQ_INTERVAL_EWMA_WE
On Tue, 27 Aug 2019, Thomas Gleixner wrote:
> On Tue, 27 Aug 2019, Ming Lei wrote:
> > +/*
> > + * Update average irq interval with the Exponential Weighted Moving
> > + * Average(EWMA)
> > + */
> > +static void irq_update_interval(void)
> > +{
> > +#define IRQ_INTERVAL_EWMA_WEIGHT 128
> > +#defi
On Tue, 27 Aug 2019, Ming Lei wrote:
> +/*
> + * Update average irq interval with the Exponential Weighted Moving
> + * Average(EWMA)
> + */
> +static void irq_update_interval(void)
> +{
> +#define IRQ_INTERVAL_EWMA_WEIGHT 128
> +#define IRQ_INTERVAL_EWMA_PREV_FACTOR 127
> +#define IRQ_I
For some high-performance IO devices, interrupts may come very frequently,
while IO request completion may take a bit of time. Especially on some
devices (SCSI or NVMe), IO requests can be submitted concurrently from
multiple CPU cores, however IO completion is only done on one of
these submission CPUs.
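For context, a rough sketch of the EWMA-based interval tracking debated
earlier in the thread. The function and macro names follow the quoted patch
fragments (irq_update_interval(), IRQ_INTERVAL_EWMA_WEIGHT, avg_irq_interval);
the struct layout, the flood threshold and the shift-only arithmetic are
assumptions for illustration rather than the posted patch. With a power-of-two
weight, avg = (avg * 127 + delta) / 128 can be computed as
((avg << 7) - avg + delta) >> 7, i.e. without the 64-bit multiply/divide
objected to above.

#include <linux/kernel.h>
#include <linux/percpu.h>
#include <linux/sched/clock.h>
#include <linux/smp.h>
#include <linux/types.h>

#define IRQ_INTERVAL_EWMA_WEIGHT	128	/* from the quoted fragment */
#define IRQ_INTERVAL_EWMA_SHIFT		7	/* log2(WEIGHT), assumed */
#define IRQ_FLOOD_THRESHOLD_NS		1000	/* assumed: avg gap < 1us counts as a flood */

struct irq_interval {
	u64	last_ns;	/* timestamp of the previous interrupt */
	u64	avg_ns;		/* EWMA of the inter-interrupt gap */
};

static DEFINE_PER_CPU(struct irq_interval, avg_irq_interval);

/* Called once per hardirq on the local CPU. */
static void irq_update_interval(void)
{
	int cpu = raw_smp_processor_id();
	struct irq_interval *inter = per_cpu_ptr(&avg_irq_interval, cpu);
	u64 now = sched_clock();
	u64 delta = now - inter->last_ns;

	inter->last_ns = now;
	/* avg = (avg * (WEIGHT - 1) + delta) / WEIGHT, using shifts only */
	inter->avg_ns = ((inter->avg_ns << IRQ_INTERVAL_EWMA_SHIFT) -
			 inter->avg_ns + delta) >> IRQ_INTERVAL_EWMA_SHIFT;
}

/* Example consumer: does the local CPU currently look IRQ-flooded? */
static bool irq_flood_detected(void)
{
	return this_cpu_ptr(&avg_irq_interval)->avg_ns < IRQ_FLOOD_THRESHOLD_NS;
}

Whether averaging sched_clock() deltas is meaningful at all when sched_clock()
is jiffies-based is exactly the open question raised earlier in the thread.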