On 2021/4/6 15:31, Michal Kubecek wrote:
> On Tue, Apr 06, 2021 at 10:46:29AM +0800, Yunsheng Lin wrote:
>> On 2021/4/6 9:49, Cong Wang wrote:
>>> On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
coming days.
On 2021/4/6 18:13, Juergen Gross wrote:
> On 06.04.21 09:06, Michal Kubecek wrote:
>> On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote:
>>>
>>> Hi, Jiri
>>> Do you have a reproducer that can be shared here?
>>> With reproducer, I can debug and test it myself too.
>>
>> I'm afraid we are
On 06.04.21 09:06, Michal Kubecek wrote:
On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote:
Hi, Jiri
Do you have a reproducer that can be shared here?
With reproducer, I can debug and test it myself too.
I'm afraid we are not aware of a simple reproducer. As mentioned in the
origin
On Tue, Apr 06, 2021 at 10:46:29AM +0800, Yunsheng Lin wrote:
> On 2021/4/6 9:49, Cong Wang wrote:
> > On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
> >>
> >> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
> >> coming days. If it works, then we can consider proceeding
On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote:
>
> Hi, Jiri
> Do you have a reproducer that can be shared here?
> With reproducer, I can debug and test it myself too.
I'm afraid we are not aware of a simple reproducer. As mentioned in the
original discussion, the race window is ext
On 2021/4/6 9:49, Cong Wang wrote:
> On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
>>
>> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
>> coming days. If it works, then we can consider proceeding with it,
>> otherwise I am all for reverting the whole NOLOCK stuff.
>>
On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote:
>
> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the
> coming days. If it works, then we can consider proceeding with it,
> otherwise I am all for reverting the whole NOLOCK stuff.
>
> [1]
> https://lore.kernel.org/linux-ca
On 2021/4/3 20:23, Jiri Kosina wrote:
> On Sat, 3 Apr 2021, Hillf Danton wrote:
>
> Sure. Seems they crept in over time. I had some plans to write a
> lockless HTB implementation. But with fq+EDT with BPF it seems that
> it is no longer needed, we have a more generic/better solution.
On Sat, 3 Apr 2021, Hillf Danton wrote:
> >>> Sure. Seems they crept in over time. I had some plans to write a
> >>> lockless HTB implementation. But with fq+EDT with BPF it seems that
> >>> it is no longer needed, we have a more generic/better solution. So
> >>> I dropped it. Also most folks sho
On 4/2/21 12:25 PM, Jiri Kosina wrote:
On Thu, 3 Sep 2020, John Fastabend wrote:
At this point I fear we could consider reverting the NOLOCK stuff.
I personally would hate doing so, but it looks like NOLOCK benefits are
outweighed by its issues.
I agree, NOLOCK brings more pains than gains. T
On Thu, 3 Sep 2020, John Fastabend wrote:
> > > At this point I fear we could consider reverting the NOLOCK stuff.
> > > I personally would hate doing so, but it looks like NOLOCK benefits are
> > > outweighed by its issues.
> >
> > I agree, NOLOCK brings more pains than gains. There are many rac
Sorry, guys, the experiment environment no longer exists. We
finally went with fq_codel for the online product.
On Fri, Sep 18, 2020 at 3:52 AM Cong Wang wrote:
>
> On Sun, Sep 13, 2020 at 7:10 PM Yunsheng Lin wrote:
> >
> > On 2020/9/11 4:19, Cong Wang wrote:
> > > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng w
On Sun, Sep 13, 2020 at 7:10 PM Yunsheng Lin wrote:
>
> On 2020/9/11 4:19, Cong Wang wrote:
> > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote:
> >> I also tried Cong's patch (shown below on my tree) and it could avoid
> >> the issue (stress-tested for 30 minutes three times with no jitter
> >>
On 2020/9/11 4:19, Cong Wang wrote:
> On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote:
>> I also tried Cong's patch (shown below on my tree) and it could avoid
>> the issue (stress-tested for 30 minutes three times with no jitter
>> observed).
>
> Thanks for verifying it!
>
>>
>> --- ./include/
On Thu, 2020-09-10 at 14:07 -0700, John Fastabend wrote:
> Cong Wang wrote:
> > On Thu, Sep 3, 2020 at 10:08 PM John Fastabend
> > wrote:
> > > Maybe this would unlock us,
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 7df6c9617321..9b09429103f1 100644
> > > --- a/net/core/
Cong Wang wrote:
> On Thu, Sep 3, 2020 at 10:08 PM John Fastabend
> wrote:
> > Maybe this would unlock us,
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 7df6c9617321..9b09429103f1 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3749,7 +3749,7 @@ static inline int
On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote:
> I also tried Cong's patch (shown below on my tree) and it could avoid
> the issue (stress-tested for 30 minutes three times with no jitter
> observed).
Thanks for verifying it!
>
> --- ./include/net/sch_generic.h.orig 2020-08-21 15:13:51.787952
On Thu, Sep 3, 2020 at 10:08 PM John Fastabend wrote:
> Maybe this would unlock us,
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7df6c9617321..9b09429103f1 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3749,7 +3749,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb,
Cong Wang wrote:
> On Thu, Sep 3, 2020 at 1:40 AM Paolo Abeni wrote:
> >
> > On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote:
> > > Can you test the attached one-line fix? I think we are overthinking,
> > > probably all
> > > we need here is a busy wait.
> >
> > I think that will solve, but I a
Hi Hillf, Cong, Paolo,
Sorry for the late reply; I was tied up with another urgent task.
I tried Hillf's patch (shown below on my tree) and it doesn't help and
the jitter shows up very quickly.
--- ./include/net/sch_generic.h.orig 2020-08-21 15:13:51.787952710 +0800
+++ ./include/net/sch_generic.h 2020-09-04 1
On Thu, Sep 3, 2020 at 1:40 AM Paolo Abeni wrote:
>
> On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote:
> > Can you test the attached one-line fix? I think we are overthinking,
> > probably all
> > we need here is a busy wait.
>
> I think that will solve the issue, but I also think it will kill NOLOCK
>
On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote:
> Can you test the attached one-line fix? I think we are overthinking,
> probably all
> we need here is a busy wait.
I think that will solve the issue, but I also think it will kill NOLOCK
performance due to greatly increased contention.
At this point I
Hello, Kehuan
Can you test the attached one-line fix? I think we are overthinking,
probably all
we need here is a busy wait.
Thanks.
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index d60e7c39d60c..fc1bacdb102b 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_gen
Hi Hillf,
Unfortunately, the memory barriers above don't help. The issue shows up
within 1 minute ...
On Thu, Aug 27, 2020 at 8:58 PM Hillf Danton wrote:
>
>
> On Thu, 27 Aug 2020 14:56:31 +0800 Kehuan Feng wrote:
> >
> > > Lets see if TCQ_F_NOLOC is making fq_codel different in your testing.
> >
> > I assume you me
Hi Hillf,
> Let’s see if TCQ_F_NOLOC is making fq_codel different in your testing.
I assume you meant disabling NOLOCK for pfifo_fast.
Here is the modification,
--- ./net/sched/sch_generic.c.orig 2020-08-24 22:02:04.589830751 +0800
+++ ./net/sched/sch_generic.c 2020-08-27 10:17:10.148977
Hi Hillf,
Thanks for the patch.
I just tried it and it looks better than the previous one. The issue
appeared only once in ~30 minutes of stress testing (without the patch,
it usually shows up within 1 minute, so I feel like we are getting close
to the final fix)
(pasted the modifications on my tree in case of
Hi Hillf,
I just tried the updated version and the system can boot up now.
It mitigates the issue a lot but still doesn't eliminate it
completely. The effect seems similar to that of Cong's patch.
On Tue, Aug 25, 2020 at 11:23 AM Hillf Danton wrote:
>
>
> Hi Feng,
>
> On Tue, 25 Aug 2020 10:18:05 +0800 Fe
Hillf,
With the latest version (what I changed on my tree is attached), the
system failed to boot, with CPU stalls.
On Sat, Aug 22, 2020 at 11:30 AM Hillf Danton wrote:
>
>
> On Thu, 20 Aug 2020 20:43:17 +0800 Hillf Danton wrote:
> > Hi Jike,
> >
> > On Thu, 20 Aug 2020 15:43:17 +0800 Jike Song wrote:
>
Hi Jike
On 8/20/20 12:43 AM, Jike Song wrote:
Hi Josh,
We possibly hit the same problem when testing NVIDIA/Mellanox's
GPUDirect RDMA product. We found that changing NET_SCH_DEFAULT to
DEFAULT_FQ_CODEL mitigated the problem, though we have no idea why.
Maybe you can give it a try as well?
We also did some
Hi Josh,
On Fri, Jul 3, 2020 at 2:14 AM Josh Hunt wrote:
{snip}
> Initial results with Cong's patch look promising, so far no stalls. We
> will let it run over the long weekend and report back on Tuesday.
>
> Paolo - I have concerns about possible performance regression with the
> change as well.
On Wed, 2020-07-08 at 13:16 -0700, Cong Wang wrote:
> On Tue, Jul 7, 2020 at 7:18 AM Paolo Abeni wrote:
> > So the regression with 2 pktgen threads is still relevant. 'perf' shows
> > significant time spent in net_tx_action() and __netif_schedule().
>
> So, touching the __QDISC_STATE_SCHED bit in
On 7/2/20, 2:08 PM, "Josh Hunt" wrote:
>
> On 7/2/20 2:45 AM, Paolo Abeni wrote:
> > Hi all,
> >
> > On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote:
> >> Hi Cong,
> >>
> >> On 01/07/2020 21:58, Cong Wang wrote:
> >>> On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
> On Tue, Jun 30, 202
On Tue, Jul 7, 2020 at 7:18 AM Paolo Abeni wrote:
> So the regression with 2 pktgen threads is still relevant. 'perf' shows
> significant time spent in net_tx_action() and __netif_schedule().
So, touching the __QDISC_STATE_SCHED bit in __dev_xmit_skb() is
not a good idea.
Let me see if there is a
On Thu, 2020-07-02 at 11:08 -0700, Josh Hunt wrote:
> On 7/2/20 2:45 AM, Paolo Abeni wrote:
> > Hi all,
> >
> > On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote:
> > > Hi Cong,
> > >
> > > On 01/07/2020 21:58, Cong Wang wrote:
> > > > On Wed, Jul 1, 2020 at 9:05 AM Cong Wang
> > > > wrote:
>
On 7/2/20 2:45 AM, Paolo Abeni wrote:
Hi all,
On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote:
Hi Cong,
On 01/07/2020 21:58, Cong Wang wrote:
On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
Do either of you know if there's been any deve
Hi all,
On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote:
> Hi Cong,
>
> On 01/07/2020 21:58, Cong Wang wrote:
> > On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
> > > On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
> > > > Do either of you know if there's been any development on a fix for
Hi Cong,
On 01/07/2020 21:58, Cong Wang wrote:
On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
Do either of you know if there's been any development on a fix for this
issue? If not we can propose something.
If you have a reproducer, I can l
On 7/1/20 12:58 PM, Cong Wang wrote:
On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
Do either of you know if there's been any development on a fix for this
issue? If not we can propose something.
If you have a reproducer, I can look into
On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote:
>
> On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
> > Do either of you know if there's been any development on a fix for this
> > issue? If not we can propose something.
>
> If you have a reproducer, I can look into this.
Does the attached patch
On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote:
> Do either of you know if there's been any development on a fix for this
> issue? If not we can propose something.
If you have a reproducer, I can look into this.
Thanks.
On 30/06/2020 21:14, Josh Hunt wrote:
On 6/23/20 6:42 AM, Michael Zhivich wrote:
From: Jonas Bonn
To: Paolo Abeni,
"netdev@vger.kernel.org",
LKML,
"David S. Miller",
John Fastabend
Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc
Date: Fr
On 6/23/20 6:42 AM, Michael Zhivich wrote:
From: Jonas Bonn
To: Paolo Abeni,
"netdev@vger.kernel.org",
LKML,
"David S. Miller",
John Fastabend
Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc
Date: Fri, 11 Oct 2019 02:39
> From: Jonas Bonn
> To: Paolo Abeni,
> "netdev@vger.kernel.org",
> LKML,
> "David S. Miller",
> John Fastabend
> Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc
> Date: Fri, 11 Oct 2019 02:39:48 +0200
>