On 2/12/21 2:50 PM, Jesper Dangaard Brouer wrote:
> As Eric pointed out in response to commit 28af22c6c8df ("net: adjust
> net_device layout for cacheline usage") the netdev_features_t members
> wanted_features and hw_features are only used in control path.
> 
> Thus, this patch reorder the netdev_features_t to let more members that
> are used in fast path into the 3rd cacheline. Whether these members are
> read depend on SKB properties, which are hinted as comments. The member
> mpls_features could not fit in the cacheline, but it was the least
> commonly used (depend on CONFIG_NET_MPLS_GSO).
> 
> In the future we should consider relocating member gso_partial_features
> to be closer to member gso_max_segs. (see usage in gso_features_check()).
> 
> Suggested-by: Eric Dumazet <eduma...@google.com>
> Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
> ---
>  include/linux/netdevice.h |   11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index bfadf3b82f9c..3898bb167579 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1890,13 +1890,16 @@ struct net_device {
>       unsigned short          needed_headroom;
>       unsigned short          needed_tailroom;
>  
> +     /* Fast path features - via netif_skb_features */
>       netdev_features_t       features;
> +     netdev_features_t       vlan_features;       /* if skb_vlan_tagged */
> +     netdev_features_t       hw_enc_features;     /* if skb->encapsulation */
> +     netdev_features_t       gso_partial_features;/* if skb_is_gso */
> +     netdev_features_t       mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */
> +
> +     /* Control path features */
>       netdev_features_t       hw_features;
>       netdev_features_t       wanted_features;
> -     netdev_features_t       vlan_features;
> -     netdev_features_t       hw_enc_features;
> -     netdev_features_t       mpls_features;
> -     netdev_features_t       gso_partial_features;
>  
>       unsigned int            min_mtu;
>       unsigned int            max_mtu;
> 
> 


Please also note we currently have at least 3 distinct blocks for tx path.

Presumably netdev_features_t are only used in TX, so should be grouped with the 
other TX
sections.


        /* --- cacheline 3 boundary (192 bytes) --- */       
...
        netdev_features_t          features;             /*  0xe0   0x8 */  

... Lots of ctrl stuff....


        /* --- cacheline 14 boundary (896 bytes) --- */
         struct netdev_queue *      _tx __attribute__((__aligned__(64))); /* 
0x380   0x8 */          


....

/* Mix of unrelated control stuff like rtnl_link_ops

 /* --- cacheline 31 boundary (1984 bytes) --- */ 
 unsigned int               gso_max_size;         /* 0x7c0   0x4 */
 u16                        gso_max_segs;         /* 0x7c4   0x2 */ 



Ideally we should move _all_ control/slow_path stuff at the very end of the 
structure,
in order to not pollute the cache lines we need for data path, to keep them as 
small
and packed as possible.

This could be done one field at a time, to ease code review.

We should have something like this 

/* section used in RX (fast) path */
/* section used in both RX/TX (fast) path */
/* section used in TX (fast) path */
/* section used for slow path, and control path */

Reply via email to