Hi,

On Tue, 2019-06-25 at 14:42 +0800, Zhiyuan Hou wrote:
> In ipvlan l3s mode,  ingress packet is switched to slave interface and
> delivers to l4 stack. This may cause two problems:
> 
>   1. When slave is in an ns different from master, the behavior of stack
>   in slave ns may cause confusion for users. For example, iptables, tc,
>   and other l2/l3 functions are not available for ingress packet.
> 
>   2. l3s mode is not used for tap device, and cannot support ipvtap. But
>   in VM or container based VM cases, tap device is a very common device.
> 
> In l3s mode's input nf_hook, this patch calles the skb_forward_dev() to
> forward ingress packet to slave and uses nf_conntrack_confirm() to make
> conntrack work with new mode.
> 
> Signed-off-by: Zha Bin <zha...@linux.alibaba.com>
> Signed-off-by: Zhiyuan Hou <zhiyuan2...@linux.alibaba.com>
> ---
>  drivers/net/ipvlan/ipvlan.h     |  9 ++++++++-
>  drivers/net/ipvlan/ipvlan_l3s.c | 16 ++++++++++++++--
>  2 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ipvlan/ipvlan.h b/drivers/net/ipvlan/ipvlan.h
> index 3837c897832e..48c814e24c3f 100644
> --- a/drivers/net/ipvlan/ipvlan.h
> +++ b/drivers/net/ipvlan/ipvlan.h
> @@ -172,6 +172,14 @@ void ipvlan_link_delete(struct net_device *dev, struct 
> list_head *head);
>  void ipvlan_link_setup(struct net_device *dev);
>  int ipvlan_link_register(struct rtnl_link_ops *ops);
>  #ifdef CONFIG_IPVLAN_L3S
> +
> +#include <net/netfilter/nf_conntrack_core.h>
> +
> +static inline int ipvlan_confirm_conntrack(struct sk_buff *skb)
> +{
> +     return nf_conntrack_confirm(skb);
> +}
> +
>  int ipvlan_l3s_register(struct ipvl_port *port);
>  void ipvlan_l3s_unregister(struct ipvl_port *port);
>  void ipvlan_migrate_l3s_hook(struct net *oldnet, struct net *newnet);
> @@ -206,5 +214,4 @@ static inline bool netif_is_ipvlan_port(const struct 
> net_device *dev)
>  {
>       return rcu_access_pointer(dev->rx_handler) == ipvlan_handle_frame;
>  }
> -
>  #endif /* __IPVLAN_H */
> diff --git a/drivers/net/ipvlan/ipvlan_l3s.c b/drivers/net/ipvlan/ipvlan_l3s.c
> index 943d26cbf39f..ed210002f593 100644
> --- a/drivers/net/ipvlan/ipvlan_l3s.c
> +++ b/drivers/net/ipvlan/ipvlan_l3s.c
> @@ -95,14 +95,26 @@ static unsigned int ipvlan_nf_input(void *priv, struct 
> sk_buff *skb,
>  {
>       struct ipvl_addr *addr;
>       unsigned int len;
> +     int ret = NF_ACCEPT;
> +     bool success;
>  
>       addr = ipvlan_skb_to_addr(skb, skb->dev);
>       if (!addr)
>               goto out;
>  
> -     skb->dev = addr->master->dev;
>       len = skb->len + ETH_HLEN;
> -     ipvlan_count_rx(addr->master, len, true, false);
> +
> +     ret = ipvlan_confirm_conntrack(skb);
> +     if (ret != NF_ACCEPT) {
> +             ipvlan_count_rx(addr->master, len, false, false);
> +             goto out;
> +     }
> +
> +     skb_push_rcsum(skb, ETH_HLEN);
> +     success = dev_forward_skb(addr->master->dev, skb) == NET_RX_SUCCESS;

This looks weird to me: if I read the code correctly, the skb will
traverse twice NF_INET_LOCAL_IN, once due to the l3s hooking and
another one due to dev_forward_skb().

Also, tc ingreess, etc will run after the first traversing of
NF_INET_LOCAL_IN.

All in all I think that if full l2 processing is required, a different
mode or a different virtual device should be used.

Cheers,

Paolo

Reply via email to