On Wed, 2016-11-30 at 18:50 -0800, Eric Dumazet wrote:
> On Wed, 2016-11-30 at 18:32 -0800, Eric Dumazet wrote:
>
> > I simply suggest we try to queue the qdisc for further servicing as we
> > do today, from net_tx_action(), but we might use a different bit, so
> > that we leave the opportunity fo
On Thu, 2016-12-01 at 09:04 -0800, Eric Dumazet wrote:
> On Thu, 2016-12-01 at 17:04 +0100, Jesper Dangaard Brouer wrote:
>
> > I think you misunderstood my concept[1]. I don't want to stop the
> > queue. The new __QUEUE_STATE_FLUSH_NEEDED does not stop the queue, is
> > it just indicating that s
On Thu, 2016-12-01 at 15:20 -0500, David Miller wrote:
> From: Eric Dumazet
> Date: Thu, 01 Dec 2016 09:04:17 -0800
>
> > On Thu, 2016-12-01 at 17:04 +0100, Jesper Dangaard Brouer wrote:
> >
> >> When qdisc layer or trafgen/af_packet see this indication it knows it
> >> should/must flush the que
On Thu, 2016-12-01 at 13:32 -0800, Alexander Duyck wrote:
> A few years back when I did something like this on ixgbe I was told by
> you that the issue was that doing something like this would add too
> much latency. I was just wondering what the latency impact is on a
> change like this and if t
On Mon, Nov 28, 2016 at 10:58 PM, Eric Dumazet wrote:
> On Mon, 2016-11-21 at 10:10 -0800, Eric Dumazet wrote:
>
>
>> Not sure it this has been tried before, but the doorbell avoidance could
>> be done by the driver itself, because it knows a TX completion will come
>> shortly (well... if softirqs
From: Eric Dumazet
Date: Thu, 01 Dec 2016 09:04:17 -0800
> On Thu, 2016-12-01 at 17:04 +0100, Jesper Dangaard Brouer wrote:
>
>> When qdisc layer or trafgen/af_packet see this indication it knows it
>> should/must flush the queue when it don't have more work left. Perhaps
>> through net_tx_acti
On Thu, 2016-12-01 at 20:17 +0100, Jesper Dangaard Brouer wrote:
> On Thu, 01 Dec 2016 09:04:17 -0800 Eric Dumazet wrote:
>
> > BTW, if you are doing tests on mlx4 40Gbit,
>
> I'm mostly testing with mlx5 50Gbit, but I do have 40G NIC in the
> machines too.
>
> > would you check the
> > fol
>>> > So we end up with one cpu doing the ndo_start_xmit() and TX completions,
>>> > and RX work.
This problem is somewhat tangential to the doorbell avoidance discussion.
>>> >
>>> > This problem is magnified when XPS is used, if one mono-threaded
>>> > application deals with
>>> > thousands of
On Thu, 01 Dec 2016 09:04:17 -0800 Eric Dumazet wrote:
> BTW, if you are doing tests on mlx4 40Gbit,
I'm mostly testing with mlx5 50Gbit, but I do have 40G NIC in the
machines too.
> would you check the
> following quick/dirty hack, using lots of low-rate flows ?
What tool should I use to se
On Thu, 2016-12-01 at 17:04 +0100, Jesper Dangaard Brouer wrote:
> I think you misunderstood my concept[1]. I don't want to stop the
> queue. The new __QUEUE_STATE_FLUSH_NEEDED does not stop the queue, is
> it just indicating that someone need to flush/ring-doorbell. Maybe it
> need another name
On Thu, 01 Dec 2016 06:24:34 -0800
Eric Dumazet wrote:
> On Thu, 2016-12-01 at 13:05 +0100, Jesper Dangaard Brouer wrote:
> > On Wed, 30 Nov 2016 18:27:45 +0200
> > Saeed Mahameed wrote:
> >
> > > >> All in all, this is risky business :), the right way to go is to
> > > >> force the upper la
On Thu, 2016-12-01 at 13:05 +0100, Jesper Dangaard Brouer wrote:
> On Wed, 30 Nov 2016 18:27:45 +0200
> Saeed Mahameed wrote:
>
> > >> All in all, this is risky business :), the right way to go is to
> > >> force the upper layer to use xmit-more and delay doorbells/use bulking
> > >> but from th
On Wed, 30 Nov 2016 18:27:45 +0200
Saeed Mahameed wrote:
> >> All in all, this is risky business :), the right way to go is to
> >> force the upper layer to use xmit-more and delay doorbells/use bulking
> >> but from the same context (xmit routine). For example see
> >> Achiad's suggestion (at
On Wed, Nov 30, 2016 at 6:32 PM, Eric Dumazet wrote:
> On Wed, 2016-11-30 at 17:16 -0800, Tom Herbert wrote:
>> On Wed, Nov 30, 2016 at 4:27 PM, Eric Dumazet wrote:
>> >
>> > Another issue I found during my tests last days, is a problem with BQL,
>> > and more generally when driver stops/starts t
On Wed, 2016-11-30 at 18:32 -0800, Eric Dumazet wrote:
> I simply suggest we try to queue the qdisc for further servicing as we
> do today, from net_tx_action(), but we might use a different bit, so
> that we leave the opportunity for another cpu to get __QDISC_STATE_SCHED
> before we grab it from
On Wed, 2016-11-30 at 17:16 -0800, Tom Herbert wrote:
> On Wed, Nov 30, 2016 at 4:27 PM, Eric Dumazet wrote:
> >
> > Another issue I found during my tests last days, is a problem with BQL,
> > and more generally when driver stops/starts the queue.
> >
> > When under stress and BQL stops the queue,
On Wed, Nov 30, 2016 at 4:27 PM, Eric Dumazet wrote:
>
> Another issue I found during my tests last days, is a problem with BQL,
> and more generally when driver stops/starts the queue.
>
> When under stress and BQL stops the queue, driver TX completion does a
> lot of work, and servicing CPU also
Another issue I found during my tests last days, is a problem with BQL,
and more generally when driver stops/starts the queue.
When under stress and BQL stops the queue, driver TX completion does a
lot of work, and servicing CPU also takes over further qdisc_run().
The work-flow is :
1) collect
On Wed, 2016-11-30 at 23:30 +0100, Jesper Dangaard Brouer wrote:
> On Wed, 30 Nov 2016 11:30:00 -0800
> Eric Dumazet wrote:
>
> > On Wed, 2016-11-30 at 20:17 +0100, Jesper Dangaard Brouer wrote:
> >
> > > Don't take is as critique Eric. I was hoping your patch would have
> > > solved this issue
On Wed, 30 Nov 2016 11:30:00 -0800
Eric Dumazet wrote:
> On Wed, 2016-11-30 at 20:17 +0100, Jesper Dangaard Brouer wrote:
>
> > Don't take is as critique Eric. I was hoping your patch would have
> > solved this issue of being sensitive to TX completion adjustments. You
> > usually have good so
On Wed, 2016-11-30 at 20:17 +0100, Jesper Dangaard Brouer wrote:
> Don't take is as critique Eric. I was hoping your patch would have
> solved this issue of being sensitive to TX completion adjustments. You
> usually have good solutions for difficult issues. I basically rejected
> Achiad's appro
On Wed, 30 Nov 2016 07:56:26 -0800
Eric Dumazet wrote:
> On Wed, 2016-11-30 at 12:38 +0100, Jesper Dangaard Brouer wrote:
> > I've played with a somewhat similar patch (from Achiad Shochat) for
> > mlx5 (attached). While it gives huge improvements, the problem I ran
> > into was that; TX perform
On Wed, 2016-11-30 at 18:27 +0200, Saeed Mahameed wrote:
>
> In this case, i think they should implement their own bulking (pktgen
> is not a good example)
> but XDP can predict if it has more packets to xmit as long as all of
> them fall in the same NAPI cycle.
> Others should try and do the
On Wed, Nov 30, 2016 at 5:44 PM, Eric Dumazet wrote:
> On Wed, 2016-11-30 at 15:50 +0200, Saeed Mahameed wrote:
>> On Tue, Nov 29, 2016 at 8:58 AM, Eric Dumazet wrote:
>> > On Mon, 2016-11-21 at 10:10 -0800, Eric Dumazet wrote:
>> >
>> >
>> >> Not sure it this has been tried before, but the doorb
On Wed, 2016-11-30 at 12:38 +0100, Jesper Dangaard Brouer wrote:
> I've played with a somewhat similar patch (from Achiad Shochat) for
> mlx5 (attached). While it gives huge improvements, the problem I ran
> into was that; TX performance became a function of the TX completion
> time/interrupt and
On Tue, 2016-11-29 at 23:28 -0800, Alexei Starovoitov wrote:
> On Mon, Nov 28, 2016 at 10:58 PM, Eric Dumazet wrote:
> > {
> > @@ -496,8 +531,13 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
> > wmb();
> >
> > /* we want to dirty this cache line once */
> > -
On Wed, 2016-11-30 at 15:50 +0200, Saeed Mahameed wrote:
> On Tue, Nov 29, 2016 at 8:58 AM, Eric Dumazet wrote:
> > On Mon, 2016-11-21 at 10:10 -0800, Eric Dumazet wrote:
> >
> >
> >> Not sure it this has been tried before, but the doorbell avoidance could
> >> be done by the driver itself, becaus
On Tue, Nov 29, 2016 at 8:58 AM, Eric Dumazet wrote:
> On Mon, 2016-11-21 at 10:10 -0800, Eric Dumazet wrote:
>
>
>> Not sure it this has been tried before, but the doorbell avoidance could
>> be done by the driver itself, because it knows a TX completion will come
>> shortly (well... if softirqs
I've played with a somewhat similar patch (from Achiad Shochat) for
mlx5 (attached). While it gives huge improvements, the problem I ran
into was that; TX performance became a function of the TX completion
time/interrupt and could easily be throttled if configured too
high/slow.
Can your patch b
On Mon, Nov 28, 2016 at 10:58 PM, Eric Dumazet wrote:
> {
> @@ -496,8 +531,13 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
> wmb();
>
> /* we want to dirty this cache line once */
> - ACCESS_ONCE(ring->last_nr_txbb) = last_nr_txbb;
> - ACCESS_ONCE(ring-
On Mon, 2016-11-21 at 10:10 -0800, Eric Dumazet wrote:
> Not sure it this has been tried before, but the doorbell avoidance could
> be done by the driver itself, because it knows a TX completion will come
> shortly (well... if softirqs are not delayed too much !)
>
> Doorbell would be forced onl