On 10/04/2017 10:54 AM, Simon Horman wrote:

Add support for RX checksum offload. This is enabled by default and
may be disabled and re-enabled using ethtool:

  # ethtool -K eth0 rx off
  # ethtool -K eth0 rx on

The RAVB provides a simple checksumming scheme which appears to be
completely compatible with CHECKSUM_COMPLETE: the sum of all packet data
after the L2 header is appended to the packet data; the driver can read
this value directly and use it to update the skb accordingly.
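
As a rough illustration of the scheme described above, here is a minimal
user-space sketch in C. It is not the driver code from this patch: the
helper names, the 2-byte checksum size, the little-endian storage and the
14-byte untagged Ethernet header are all assumptions made for the example.
It models the hardware appending a 16-bit one's-complement sum of the data
after the L2 header, and a receive path that reads that value and trims it
off the frame.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14 /* untagged Ethernet header (assumption) */
#define HW_CSUM_LEN 2  /* size of the appended checksum (assumption) */

/* 16-bit one's-complement sum of buf[0..len), folded to 16 bits. */
static uint16_t ones_sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)buf[i] << 8 | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8; /* pad an odd trailing byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16); /* fold carries back in */
    return (uint16_t)sum;
}

/*
 * Hypothetical stand-in for the hardware: append the sum of everything
 * after the L2 header to the frame. The buffer must have room for
 * HW_CSUM_LEN extra bytes. Returns the new frame length.
 */
static size_t fake_hw_append_csum(uint8_t *frame, size_t len)
{
    uint16_t sum = ones_sum16(frame + ETH_HDR_LEN, len - ETH_HDR_LEN);

    frame[len] = sum & 0xff; /* little-endian storage (assumption) */
    frame[len + 1] = sum >> 8;
    return len + HW_CSUM_LEN;
}

/*
 * Hypothetical stand-in for the receive path: read the trailing checksum
 * and trim it from the frame length. Returns 0 on success, -1 if the
 * frame is too short to hold a checksum.
 */
static int rx_take_csum(const uint8_t *frame, size_t *len, uint16_t *csum)
{
    const uint8_t *tail;

    if (*len < HW_CSUM_LEN)
        return -1;
    tail = frame + *len - HW_CSUM_LEN;
    *csum = (uint16_t)(tail[0] | tail[1] << 8);
    *len -= HW_CSUM_LEN;
    return 0;
}
```

In a real driver the value read this way would be stored in the skb and
the skb marked CHECKSUM_COMPLETE, so the stack can verify L4 checksums
without summing the payload again in software.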

In terms of performance, throughput is close to gigabit line rate both with
and without RX checksum offload enabled. Perf output, however, appears to
indicate that significantly less time is spent in do_csum() when offload
is enabled. This is as expected.

Test results with RX checksum offload enabled:
  # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
  MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
  enable_enobufs failed: getprotobyname
  Recv   Send    Send
  Socket Socket  Message  Elapsed
  Size   Size    Size     Time     Throughput
  bytes  bytes   bytes    secs.    10^6bits/sec

   87380  16384  16384    10.00     937.54

  Summary of output of perf report:
     18.28%      ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     10.34%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
      9.83%      ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
      7.89%      ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
      4.01%      ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
      3.37%          netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
      3.17%          swapper  [kernel.kallsyms]  [k] arch_cpu_idle
      2.55%          swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
      2.04%      ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
      2.03%          swapper  [kernel.kallsyms]  [k] _raw_spin_unlock_irq
      1.96%      ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
      1.59%      ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83

Test results without RX checksum offload enabled:
  # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
  MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
  enable_enobufs failed: getprotobyname
  Recv   Send    Send
  Socket Socket  Message  Elapsed
  Size   Size    Size     Time     Throughput
  bytes  bytes   bytes    secs.    10^6bits/sec

   87380  16384  16384    10.00     940.20

  Summary of output of perf report:
     17.10%    ksoftirqd/0  [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     10.99%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi_memcpy
      8.87%    ksoftirqd/0  [kernel.kallsyms]  [k] ravb_poll
      8.16%    ksoftirqd/0  [kernel.kallsyms]  [k] skb_put
      7.42%    ksoftirqd/0  [kernel.kallsyms]  [k] do_csum
      3.91%    ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
      2.31%        swapper  [kernel.kallsyms]  [k] arch_cpu_idle
      2.16%    ksoftirqd/0  [kernel.kallsyms]  [k] __pi___inval_dcache_area
      2.14%    ksoftirqd/0  [kernel.kallsyms]  [k] __netdev_alloc_skb
      1.93%        netperf  [kernel.kallsyms]  [k] __arch_copy_to_user
      1.79%        swapper  [kernel.kallsyms]  [k] tick_nohz_idle_enter
      1.63%    ksoftirqd/0  [kernel.kallsyms]  [k] __slab_alloc.isra.83

Above results collected on an R-Car Gen 3 Salvator-X/r8a7796 ES1.0.
Also tested on an R-Car Gen 3 Salvator-X/r8a7795 ES1.0.

By inspection this also appears to be compatible with the ravb found
on R-Car Gen 2 SoCs; however, this patch is currently untested on such
hardware.

Signed-off-by: Simon Horman <horms+rene...@verge.net.au>

---
v2
Address review of v1 by Sergei Shtylyov
* Set features rather than ORing them with (zero) existing values
* Set/unset using a single call to ravb_modify()
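
The v2 note about setting and unsetting with a single call can be pictured
with a small read-modify-write helper. This is only a sketch under stated
assumptions: reg_modify(), set_rx_csum() and the RX_CSUM_EN bit position
are hypothetical names and values chosen for the example, and a plain
variable stands in for the memory-mapped register.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_CSUM_EN (1u << 26) /* hypothetical enable bit */

/* Clear the bits in 'clear', then set the bits in 'set', in one update. */
static void reg_modify(uint32_t *reg, uint32_t clear, uint32_t set)
{
    *reg = (*reg & ~clear) | set;
}

/* Enable or disable RX checksum offload with a single modify call. */
static void set_rx_csum(uint32_t *reg, bool enable)
{
    reg_modify(reg, RX_CSUM_EN, enable ? RX_CSUM_EN : 0);
}
```

Because the same bit is always passed as the clear mask, one call covers
both directions and no separate enable and disable paths are needed.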

  Already belated but still:

Reviewed-by: Sergei Shtylyov <sergei.shtyl...@cogentembedded.com>

MBR, Sergei
