On 07/13/2018 07:19 AM, Edward Cree wrote:
> On 12/07/18 21:10, Or Gerlitz wrote:
>> On Wed, Jul 11, 2018 at 11:06 PM, Jesper Dangaard Brouer
>> <bro...@redhat.com> wrote:
>>> One reason I didn't "just" send a patch is that Edward has so far only
>>> implemented netif_receive_skb_list() and not napi_gro_receive_list().
>> sfc doesn't support GRO?! That doesn't make sense... Edward?
> sfc has a flag, EFX_RX_PKT_TCP, set according to bits in the RX event; we
> call napi_{get,gro}_frags() (via efx_rx_packet_gro()) for TCP packets, and
> netif_receive_skb() (or now the list handling) (via efx_rx_deliver()) for
> non-TCP packets.  So we avoid the GRO overhead for non-TCP workloads.
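
[ For context, the per-packet dispatch Edward describes is roughly the
following.  This is only my sketch of the branch, not the literal sfc
driver code; the argument lists are elided:

	if (rx_buf->flags & EFX_RX_PKT_TCP)
		efx_rx_packet_gro(...);	/* napi_{get,gro}_frags() path */
	else
		efx_rx_deliver(...);	/* netif_receive_skb(), or now
					 * the skb list handling */
]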
>
>> Same TCP performance
>>
>>     with GRO and no rx-batching
>> or
>>     without GRO and with rx-batching
>>
>> is by far not an intuitive result.
> I'm also surprised by this.  If I can find the time, I'll try to run
> similar experiments on sfc.
> Jesper, are the CPU utilisations similar in both cases?  Are you sure
> your stream isn't TX-limited?
1) Make sure to test the case where packets of X flows are interleaved on
the wire, instead of being nice to the receiver (trains of packets for each
flow).
(Interleaving is the typical case on a fabric, since switches will mix the
ingress traffic headed to one egress port.)
2) Do not test TCP_STREAM traffic, but TCP_RR
(RPC-like traffic, where GRO really cuts down the number of ACK packets).
TCP_STREAM can hide the GRO gain, since ACKs are naturally decimated under
sufficient load.
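
For concreteness, the comparison I would suggest is many concurrent
request/response flows, e.g. a few dozen parallel instances of
"netperf -H $peer -t TCP_RR -l 30 -- -r 1,1" against the receiver under
test ($peer and the 1-byte request/response sizes are just placeholders),
watching transactions/sec and CPU utilisation with and without
rx-batching.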