Dear Network Core developers!

I've been debugging an issue with multicast replies sent from the underlying interface of a MACVLAN towards the MACVLAN itself. These SKBs never contain a MAC header and therefore cannot be processed properly by MACVLAN.

The use case is the following:

eth1 <-- eth1.212 <-- macvlan@eth1.212 (in bridge mode)

As I understand the problem, the intermediate VLAN interface plays no role here.

The problem is: if macvlan@eth1.212 sends a Router Solicitation, these SKBs are received on eth1.212, but the corresponding multicast Router Advertisements are not received on macvlan@eth1.212.

I've tracked the problem down to the following incompatibility between the MACVLAN code and the IP code...

On the one hand, MACVLAN always expects an Ethernet header:

static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
{
	struct macvlan_port *port;
	struct sk_buff *skb = *pskb;
	const struct ethhdr *eth = eth_hdr(skb);
...
	port = macvlan_port_get_rcu(skb->dev);
	if (is_multicast_ether_addr(eth->h_dest)) {

On the other hand, IP doesn't populate the Ethernet header for multicast loopback transmission:

int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
{
	skb_reset_mac_header(skb);
	__skb_pull(skb, skb_network_offset(skb));
	skb->pkt_type = PACKET_LOOPBACK;
	skb->ip_summed = CHECKSUM_UNNECESSARY;
	WARN_ON(!skb_dst(skb));
	skb_dst_force(skb);
	netif_rx_ni(skb);

Unicast, however, works fine because of:

int neigh_connected_output(struct neighbour *neigh, struct sk_buff *skb)
{
	struct net_device *dev = neigh->dev;
	unsigned int seq;
	int err;

	do {
		__skb_pull(skb, skb_network_offset(skb));
		seq = read_seqbegin(&neigh->ha_lock);
		err = dev_hard_header(skb, dev, ntohs(skb->protocol),
				      neigh->ha, NULL, skb->len);
	} while (read_seqretry(&neigh->ha_lock, seq));

	if (err >= 0)
		err = dev_queue_xmit(skb);

I've also collected some stack traces and SKB dumps to illustrate the problem (I've instrumented macvlan_handle_frame() and eth_header() to see when the Ethernet header is generated):

macvlan_handle_frame() receives the Router Advertisement, but cannot forward it without an Ethernet header:

skb len=96 headroom=40 headlen=96 tailroom=56
mac=(40,0) net=(40,40) trans=80
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0xae2e9a2f ip_summed=1 complete_sw=0 valid=0 level=0)
hash(0xc97ebd88 sw=1 l4=1) proto=0x86dd pkttype=5 iif=24
dev name=etha01.212 feat=0x0x0000000040005000
skb headroom: 00000000: 00 28 b3 4d 84 88 ff ff b2 72 b9 5e 00 00 00 00
skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
skb headroom: 00000020: 08 0f 00 00 00 00 00 00
skb linear: 00000000: 60 09 88 bd 00 38 3a ff fe 80 00 00 00 00 00 00
skb linear: 00000010: 00 40 43 ff fe 80 00 00 ff 02 00 00 00 00 00 00
skb linear: 00000020: 00 00 00 00 00 00 00 01 86 00 61 00 40 00 00 2d
skb linear: 00000030: 00 00 00 00 00 00 00 00 03 04 40 e0 00 00 01 2c
skb linear: 00000040: 00 00 00 78 00 00 00 00 fd 5f 42 68 23 87 a8 81
skb linear: 00000050: 00 00 00 00 00 00 00 00 01 01 02 40 43 80 00 00
skb tailroom: 00000000: 00 f0 01 00 00 00 00 00 a4 73 00 00 00 00 00 00
skb tailroom: 00000010: a4 73 00 00 00 00 00 00 00 10 00 00 00 00 00 00
skb tailroom: 00000020: 01 00 00 00 06 00 00 00 40 66 02 00 00 00 00 00
skb tailroom: 00000030: 40 76 02 00 00 00 00 00
Call Trace:
 <IRQ>
 dump_stack+0x69/0x9b
 macvlan_handle_frame+0x321/0x425 [macvlan]
 ? macvlan_forward_source+0x110/0x110 [macvlan]
 __netif_receive_skb_core+0x545/0xda0
 ? ip6_mc_input+0x103/0x250 [ipv6]
 ? ipv6_rcv+0xe1/0xf0 [ipv6]
 ? __netif_receive_skb_one_core+0x36/0x70
 __netif_receive_skb_one_core+0x36/0x70
 process_backlog+0x97/0x140
 net_rx_action+0x1eb/0x350
 __do_softirq+0xe3/0x383
 do_softirq_own_stack+0x2a/0x40
 </IRQ>
 do_softirq.part.4+0x4e/0x50
 netif_rx_ni+0x60/0xd0
 dev_loopback_xmit+0x83/0xf0
 ip6_finish_output2+0x575/0x590 [ipv6]
 ? ip6_cork_release.isra.1+0x64/0x90 [ipv6]
 ? __ip6_make_skb+0x38d/0x680 [ipv6]
 ? ip6_output+0x6c/0x140 [ipv6]
 ip6_output+0x6c/0x140 [ipv6]
 ip6_send_skb+0x1e/0x60 [ipv6]
 rawv6_sendmsg+0xc4b/0xe10 [ipv6]
 ? proc_put_long+0xd0/0xd0
 ? rw_copy_check_uvector+0x4e/0x110
 ? sock_sendmsg+0x36/0x40
 sock_sendmsg+0x36/0x40
 ___sys_sendmsg+0x2b6/0x2d0
 ? proc_dointvec+0x23/0x30
 ? addrconf_sysctl_forward+0x8d/0x250 [ipv6]
 ? dev_forward_change+0x130/0x130 [ipv6]
 ? _raw_spin_unlock+0x12/0x30
 ? proc_sys_call_handler.isra.14+0x9f/0x110
 ? __call_rcu+0x213/0x510
 ? get_max_files+0x10/0x10
 ? trace_hardirqs_on+0x2c/0xe0
 ? __sys_sendmsg+0x63/0xa0
 __sys_sendmsg+0x63/0xa0
 do_syscall_64+0x6c/0x1e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Later, when the same RA is transmitted via neigh_connected_output(), the Ethernet header is generated for this packet for the first time, but this copy goes towards the "world", not the internal MACVLAN bridge:

skb len=110 headroom=26 headlen=110 tailroom=56
mac=(-1,-1) net=(40,40) trans=80
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0xae2e9a2f ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0xc97ebd88 sw=1 l4=1) proto=0x86dd pkttype=0 iif=0
dev name=etha01.212 feat=0x0x0000000040005000
sk family=10 type=3 proto=58
skb headroom: 00000000: 00 28 b3 4d 84 88 ff ff b2 72 b9 5e 00 00 00 00
skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00
skb linear: 00000000: 33 33 00 00 00 01 02 40 43 80 00 00 86 dd 60 09
skb linear: 00000010: 88 bd 00 38 3a ff fe 80 00 00 00 00 00 00 00 40
skb linear: 00000020: 43 ff fe 80 00 00 ff 02 00 00 00 00 00 00 00 00
skb linear: 00000030: 00 00 00 00 00 01 86 00 61 00 40 00 00 2d 00 00
skb linear: 00000040: 00 00 00 00 00 00 03 04 40 e0 00 00 01 2c 00 00
skb linear: 00000050: 00 78 00 00 00 00 fd 5f 42 68 23 87 a8 81 00 00
skb linear: 00000060: 00 00 00 00 00 00 01 01 02 40 43 80 00 00
skb tailroom: 00000000: 00 f0 01 00 00 00 00 00 a4 73 00 00 00 00 00 00
skb tailroom: 00000010: a4 73 00 00 00 00 00 00 00 10 00 00 00 00 00 00
skb tailroom: 00000020: 01 00 00 00 06 00 00 00 40 66 02 00 00 00 00 00
skb tailroom: 00000030: 40 76 02 00 00 00 00 00
Call Trace:
 dump_stack+0x69/0x9b
 debug_hdr+0x4c/0x60
 eth_header+0x71/0xe0
 vlan_dev_hard_header+0x58/0x140 [8021q]
 neigh_connected_output+0xa9/0x100
 ip6_finish_output2+0x24a/0x590 [ipv6]
 ? ip6_cork_release.isra.1+0x64/0x90 [ipv6]
 ? __ip6_make_skb+0x38d/0x680 [ipv6]
 ? ip6_output+0x6c/0x140 [ipv6]
 ip6_output+0x6c/0x140 [ipv6]
 ip6_send_skb+0x1e/0x60 [ipv6]
 rawv6_sendmsg+0xc4b/0xe10 [ipv6]
 ? proc_put_long+0xd0/0xd0
 ? rw_copy_check_uvector+0x4e/0x110
 ? sock_sendmsg+0x36/0x40
 sock_sendmsg+0x36/0x40
 ___sys_sendmsg+0x2b6/0x2d0
 ? proc_dointvec+0x23/0x30
 ? addrconf_sysctl_forward+0x8d/0x250 [ipv6]
 ? dev_forward_change+0x130/0x130 [ipv6]
 ? _raw_spin_unlock+0x12/0x30
 ? proc_sys_call_handler.isra.14+0x9f/0x110
 ? __call_rcu+0x213/0x510
 ? get_max_files+0x10/0x10
 ? trace_hardirqs_on+0x2c/0xe0
 ? __sys_sendmsg+0x63/0xa0
 __sys_sendmsg+0x63/0xa0
 do_syscall_64+0x6c/0x1e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

I would appreciate any hint on how to approach this problem! I can try to come up with a patch, but as this is such a central part of the IP stack, I'd like to hear some opinions first...

-- 
Best regards,
Alexander Sverdlin.