On Mon, 2014-03-24 at 06:17 -0700, Eric Dumazet wrote:
> On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
> > ip_fast_csum() either needs an explicit "m" constraint for the actual
> > buffer (and target) bytes, or the stronger "memory" constraint.
> > The 'volatile' is then not needed.
I am
On Mon, 2014-03-24 at 10:30 +, David Laight wrote:
> From: Eric Dumazet
> > On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> > > From: Eric Dumazet
> > > Date: Fri, 21 Mar 2014 05:50:50 -0700
> > >
> > > > It looks like a barrier() would be more appropriate.
> > >
> > > barrier() == __
From: Eric Dumazet
> On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> > From: Eric Dumazet
> > Date: Fri, 21 Mar 2014 05:50:50 -0700
> >
> > > It looks like a barrier() would be more appropriate.
> >
> > barrier() == __asm__ __volatile__(:::"memory")
>
> Indeed, but now you mention it, ip
On Fri, 2014-03-21 at 14:52 -0400, David Miller wrote:
> From: Eric Dumazet
> Date: Fri, 21 Mar 2014 05:50:50 -0700
>
> > It looks like a barrier() would be more appropriate.
>
> barrier() == __asm__ __volatile__(:::"memory")
Indeed, but now you mention it, ip_fast_csum() do not uses volatile
k
From: Eric Dumazet
Date: Fri, 21 Mar 2014 05:50:50 -0700
> It looks like a barrier() would be more appropriate.
barrier() == __asm__ __volatile__(:::"memory")
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More maj
From: Eric Dumazet
> On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> > Eric Dumazet writes:
> > >
> > > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > > is insane...
> >
> >
> > Couldn't it just be the cache miss?
>
> Or the fact that we mix 16 bit stores and 3
On Fri, 2014-03-21 at 06:47 -0700, Eric Dumazet wrote:
> Another idea would be to move the ip_fast_csum() call at the end of
> inet_gro_complete()
>
> I'll try this :
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 8c54870db792..0ca8f350a532 100644
> --- a/net/ipv4/af_inet.c
> +
On Fri, 2014-03-21 at 06:32 -0700, Eric Dumazet wrote:
> On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
>
> > Or the fact that we mix 16 bit stores and 32bit loads ?
> >
> > iph->tot_len = newlen;
> > iph->check = 0;
> > iph->check = ip_fast_csum(iph, 5);
>
> Yep definitely. Using 16 bit
On Fri, 2014-03-21 at 05:50 -0700, Eric Dumazet wrote:
> Or the fact that we mix 16 bit stores and 32bit loads ?
>
> iph->tot_len = newlen;
> iph->check = 0;
> iph->check = ip_fast_csum(iph, 5);
Yep definitely. Using 16 bit loads in ip_fast_csum() totally removes the
stall. I no longer see inet_
On Fri, Mar 21, 2014 at 05:50:50AM -0700, Eric Dumazet wrote:
> On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> > Eric Dumazet writes:
> > >
> > > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > > is insane...
> >
> >
> > Couldn't it just be the cache miss?
>
On Thu, 2014-03-20 at 18:56 -0700, Andi Kleen wrote:
> Eric Dumazet writes:
> >
> > I saw csum_partial() consuming 1% of cpu cycles in a GRO workload, that
> > is insane...
>
>
> Couldn't it just be the cache miss?
Or the fact that we mix 16 bit stores and 32bit loads ?
iph->tot_len = newlen;
11 matches
Mail list logo