Thanks to Mitchell for figuring this out.
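
For anyone finding this in the archives: the CLR() line Lee found in CVS and
the "&= ~" line in Mitchell's diff quoted below do the same thing to the mbuf
flags. Here is a minimal standalone sketch of that operation. The flag values
are stand-ins chosen for illustration (the real ones are defined in
sys/mbuf.h), and clear_link_cast() is a hypothetical helper, not a kernel
function.

```c
#include <assert.h>

/* Stand-in values for illustration only; see sys/mbuf.h for the real ones. */
#define M_PKTHDR 0x0002u
#define M_BCAST  0x0100u
#define M_MCAST  0x0200u

/* Same shape as the CLR() macro in sys/param.h. */
#define CLR(t, f) ((t) &= ~(f))

/*
 * Hypothetical helper: return the flags with the link-level broadcast and
 * multicast marks removed, leaving every other bit untouched. This mirrors
 * what mpw does to m0->m_flags before handing the packet to mpls_output().
 */
unsigned int
clear_link_cast(unsigned int flags)
{
	CLR(flags, M_BCAST | M_MCAST);

	/* Neither mark may survive the clear. */
	assert((flags & (M_BCAST | M_MCAST)) == 0);
	return (flags);
}
```

So an ARP request that arrives flagged M_PKTHDR|M_BCAST leaves with only
M_PKTHDR set, and the labelled packet is sent as ordinary unicast MPLS.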

> On 3 Apr 2019, at 05:25, Lee Nelson <lnel...@nelnet.org> wrote:
> 
> Since Mitchell's last email, this appeared from CVS in the place where
> the patch was supposed to be applied:
> 
> CLR(m0->m_flags, M_BCAST|M_MCAST);
> 
> I skipped the patch and compiled the kernel with the source as I found
> it from CVS.  With this new kernel everything works as I expected. arp
> broadcast requests coming into the bridge on one mpw are being seen by
> the router on the other mpw and arp replies are getting back to the
> requesting router.
> 
> Thank you to everyone!!!
> 
> On Tue, Apr 2, 2019 at 4:52 AM Mitchell Krome <mitchellkr...@gmail.com> wrote:
>> 
>> 
>> 
>> On 2/04/2019 7:57 pm, Mitchell Krome wrote:
>>> 
>>> 
>>> On 2/04/2019 7:24 pm, David Gwynne wrote:
>>>> 
>>>> 
>>>>> On 2 Apr 2019, at 6:41 pm, Mitchell Krome <mitchellkr...@gmail.com> wrote:
>>>>> 
>>>>> On 2/04/2019 2:08 pm, David Gwynne wrote:
>>>>>> Can you send me the hostname.* files and the output of ifconfig (showing 
>>>>>> all interfaces)?
>>>>>> 
>>>>>> You're using -current now, right?
>>>>>> 
>>>>>> dlg
>>>>>> 
>>>>>>> On 2 Apr 2019, at 08:15, lnel...@nelnet.org wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> First of all the protected domain seems to do the opposite of what I
>>>>>>> need, but it may only appear to be the case because of the strangeness
>>>>>>> with broadcast.  When trying to ping (or send any traffic) between
>>>>>>> rtr01 and rtr02 and the two mpw2's are in the same protected domain,
>>>>>>> the arp requests die in the bridge.  The arp never shows up at all on
>>>>>>> the other mpw. If I remove the mpw's from the protected domain, then
>>>>>>> the arp traffic gets through to the other mpw, but it doesn't get sent
>>>>>>> out properly by MPLS.  It's sent out as MPLS broadcast traffic
>>>>>>> originating on the physical ethernet interface but with the right label
>>>>>>> for the pseudowire. Even though the arp request itself is broadcast
>>>>>>> traffic, I would expect it to be encapsulated in a unicast MPLS packet
>>>>>>> which is sent from the MAC of the bridge or the originating router and
>>>>>>> sent as unicast to the destination router with the pseudowire's
>>>>>>> label.  As it is now, even if the destination router could figure out
>>>>>>> what to do with these MPLS broadcast packets, it would respond to the
>>>>>>> physical interface and not the bridge.
>>>>> 
>>>>> You only need the protected domain if you do a full mesh vpls (I.E.
>>>>> every router has a mpw to every other router). That wasn't the config
>>>>> you showed initially so I don't think you need it in your case.
>>>>> 
>>>>> I am running the following diff to get MPLS to work with GRE as I had a
>>>>> similar ARP issue that was caused by gre_input tagging the packets as
>>>>> MCAST and then mpls_input dropping them. When I looked into it I didn't
>>>>> think that should cause the issue I was seeing for a real interface as
>>>>> ether_input didn't re-add the MCAST flag, but I also don't have a real
>>>>> box to test on. You can give it a go and see if it helps.
>>>> 
>>>> I think you've found the problem. mpls_output replaces if_output though, 
>>>> so for interfaces with mpls enabled, this change causes 
>>>> BCAST|MCAST to be cleared for all outgoing packets. ie, it might break 
>>>> things like ipv6 nd on ethernet interfaces.
>>> 
>>> Yeah I had no idea what the impact of that change was, it seemed like a
>>> hack when I wrote it...
>>> 
>>>> 
>>>> What are you running on top of GRE that hit this?
>>> 
>>> I have a vpls over GRE. And I had some weird behaviour where arp was
>>> being dropped only on paths that skipped the outer MPLS label. I.E.
>>> we're directly connected to the next-hop and implicit null means we
>>> never add the LSP label, only the service label. Thanks to tcpdump not
>>> knowing about multicast MPLS over GRE and printing weirdness I worked
>>> out what was going on and tracked it down to this.
>>> 
>>>> 
>>>> For now it might be better to have mpw etc clear the flags before calling 
>>>> mpls_output.
>>> 
>>> That makes sense - anything going over an LSP (even if it's one that has
>>> no label due to implicit null) should never be multicast.
>>> 
>>>> 
>>>> Cheers,
>>>> dlg
>>>> 
>> 
>> Give this diff a try. I totally didn't see the initial post showing the
>> remote routers as MikroTik, so it's entirely possible they drop the
>> multicast MPLS ethertype, whereas OpenBSD just treats it as normal
>> unicast unless you have it over gre. An almost identical diff can be
>> applied to mpe.
>> 
>> diff --git sys/net/if_mpw.c sys/net/if_mpw.c
>> index 4348dff3f..e591e2a77 100644
>> --- sys/net/if_mpw.c
>> +++ sys/net/if_mpw.c
>> @@ -672,6 +672,9 @@ mpw_start(struct ifnet *ifp)
>> 
>>                m0->m_pkthdr.ph_rtableid = ifp->if_rdomain;
>> 
>> +               /* Once we add a label it's a P2P tunnel */
>> +               m0->m_flags &= ~(M_BCAST | M_MCAST);
>> +
>>                mpls_output(ifp0, m0, (struct sockaddr *)&smpls, rt);
>>        }
>> 
>> 
> 
> 
> -- 
> https://keybase.io/nelsonov
