On Wed, Nov 13, 2013 at 01:32:50PM -, David Laight wrote:
> I'm not sure, what's the typical capacity for the branch predictor's
> ability to remember code paths?

For such simple single-target branches it goes near or over a thousand for
recent Intel and AMD microarchitectures. Thousands for really recent CPUs.

IIRC the x86 can also correctly ...

On Wed, Nov 13, 2013 at 10:09:51AM -, David Laight wrote:
[]

Sure, I modified the code so that we only prefetched 2 cache lines ahead, but
only if the overall length of the input buffer is more than 2 cache lines.
Below are the results (all counts are the average of 100 iterations of the
csum operation, as previous tests were; I just omitted that ...
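
For illustration only, a minimal self-contained sketch of the scheme Neil
describes: prefetch the cache line two lines ahead of the position being
summed, and skip prefetching entirely for buffers of two cache lines or
less. This is not the posted patch; the 64-byte line size, the use of GCC's
__builtin_prefetch, and the trivial byte-wise sum loop are all assumptions
made for the example.

#include <stddef.h>

#define CACHE_LINE_SIZE 64	/* assumed line size for the example */

static unsigned int csum_sketch(const unsigned char *buf, size_t len)
{
	unsigned int sum = 0;
	size_t i;
	/* Skip prefetching entirely for short buffers. */
	int do_prefetch = len > 2 * CACHE_LINE_SIZE;

	for (i = 0; i < len; i++) {
		/* Once per line, prefetch the line two ahead of this one. */
		if (do_prefetch && (i % CACHE_LINE_SIZE) == 0 &&
		    i + 2 * CACHE_LINE_SIZE < len)
			__builtin_prefetch(buf + i + 2 * CACHE_LINE_SIZE);
		sum += buf[i];	/* stand-in for the real checksum step */
	}
	return sum;
}
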
On Tue, 2013-11-12 at 14:50 -0500, Neil Horman wrote:
> So, the numbers are correct now that I returned my hardware to its
> previous interrupt affinity state, ...

On Mon, Nov 11, 2013 at 05:42:22PM -0800, Joe Perches wrote:
> Hi again Neil.
>
> Forwarding on to netdev with a concern as to how often
> do_csum is used via csum_partial for very short headers
> and what impact any prefetch would have there.
>
> Also, what changed in your test environment?
>
> -------- Forwarded Message --------
> From: Neil Horman
> To: Joe Perches
> Cc: Dave Jones, linux-kernel@vger.kernel.org,
>     sebastien.du...@bull.net, Thomas Gleixner, Ingo Molnar,
>     H. Peter Anvin, x...@kernel.org
> Subject: Re: [PATCH v2 2/2] x86: add prefetching to do_csum

* Neil Horman wrote:
> Ingo, does that seem reasonable to you?
FYI, in the past few days I've been busy due to the merge window, but
everything I've seen so far in this portion of the thread gave me warm
fuzzy feelings, so I definitely like the direction.
(More once I get around to looking a ...)

On Wed, Nov 06, 2013 at 12:19:52PM -0800, Andi Kleen wrote:
> Neil Horman writes:
> > do_csum was identified via perf recently as a hot spot when doing
> > receive on ip over infiniband workloads. ...

On Wed, 2013-11-06 at 15:02 -0500, Neil Horman wrote:
> On Wed, Nov 06, 2013 at 09:19:23AM -0800, Joe Perches wrote:
[]
> > __always_inline instead of inline
> > static __always_inline void prefetch_lines(const void *addr, size_t len)
> > {
> >         const void *end = addr + len;
> > ...
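
Only the first lines of Joe's suggested helper survive in the archive. A
self-contained sketch of what such a prefetch_lines() helper could look
like follows; the 64-byte line size, the __always_inline define, and GCC's
__builtin_prefetch (in place of the kernel's prefetch() and L1_CACHE_BYTES)
are assumptions for the example, not the text of Joe's mail.

#include <stddef.h>

#define __always_inline inline __attribute__((always_inline))
#define CACHE_LINE_SIZE 64	/* assumption; the kernel would use L1_CACHE_BYTES */

/* Touch every cache line in [addr, addr + len) with a prefetch hint. */
static __always_inline void prefetch_lines(const void *addr, size_t len)
{
	const void *end = addr + len;

	for (; addr < end; addr += CACHE_LINE_SIZE)
		__builtin_prefetch(addr);
}
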
On Wed, Nov 06, 2013 at 10:23:10AM -0800, Eric Dumazet wrote:
> On Wed, 2013-11-06 at 10:54 -0500, Neil Horman wrote:
> > My guess was that the whole comment was made in reference to the fact
> > that checksum offload negated all these advantages. That's not so true
> > anymore, since infiniband needs csum in software for ipoib.
> >
> > I'll fix this up and send a v...

do_csum was identified via perf recently as a hot spot when doing
receive on ip over infiniband workloads. After a lot of testing and
ideas, we found the best optimization available to us currently is to
prefetch the entire data buffer prior to doing the checksum.

Signed-off-by: Neil Horman
CC: se...
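
As an illustration of the approach the changelog describes (prefetch the
whole buffer before summing it), a minimal sketch follows. It is not the
actual x86 do_csum patch; the line size, the helper name, and the trivial
byte-wise sum are assumptions made for the example.

#include <stddef.h>

#define CACHE_LINE_SIZE 64	/* assumed line size for the example */

static unsigned int do_csum_sketch(const unsigned char *buf, size_t len)
{
	const unsigned char *p;
	unsigned int sum = 0;
	size_t i;

	/* Warm the cache: prefetch every line of the buffer up front. */
	for (p = buf; p < buf + len; p += CACHE_LINE_SIZE)
		__builtin_prefetch(p);

	/* The real do_csum folds 16-bit words; a byte sum stands in here. */
	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}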