On Wed, Apr 6, 2016 at 5:33 PM, Nicholas Bastin <nick.bas...@gmail.com>
wrote:

> On Wed, Apr 6, 2016 at 5:16 PM, Ryan Izard <riz...@g.clemson.edu> wrote:
>
>> I have a very simple topology as follows:
>>
>> network----[Dell S4810]-24---link---1-[host w/OVS br0]-LOCAL
>>
>> The host with OVS has IP 192.168.1.3/24 with a route into the br0 (i.e.
>> LOCAL) interface.
>>
>
> I don't really understand what this means.  What ports are on br0 and what
> interfaces have IP addresses?
>

br0 has port 1 (eth1) and LOCAL (br0)

>
>
>> We try to ping another host on the network from host 192.168.1.3, but the
>> ping confuses our controller's MAC learning algorithm due to OVS
>> mishandling ARP packets. Here are some observations:
>>
>
> Where are you issuing the ping from, the command line of the host with
> OVS?  What do your local routing and arp tables look like?
>

On the host itself running the OVS bridge, we have a route for
192.168.1.0/24 into br0. We are running ping 192.168.1.4 from the host.

>
>
>> -- using OVS 2.3.1 and has been running stably since release until
>> recently (no known changes)
>>
>
> Do ovs-vsctl commands hang?  I doubt it in your case, but we've had some
> lockups on vswitchd that forced us to upgrade all the VTS hardware to 2.5.0.
>

Nope. Nothing hangs.

>
>
>> -- there is only 1 flow installed. It is a single, zero-priority,
>> fully-wildcarded table-miss flow w/output=controller
>>
>
> Well, not really.. :-)  Try:
>
> sudo ovs-appctl bridge/dump-flows br0
>

Good idea :-) I did not realize you could dive that deep into the
forwarding tables. There are some ARP flows with NORMAL output actions.
These definitely look suspicious, especially the one matching our host as
src MAC, ethertype=ARP, and opcode=request...
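
To pick the suspicious entries out of the dump, a small filter along these
lines may help. It is only a sketch: arp_normal_flows is a made-up helper
name, and it assumes each flow is printed on one line with "arp" among the
match fields and "actions=NORMAL" in the action list, which may differ
between OVS versions.

```shell
#!/bin/sh
# Hypothetical helper: keep only ARP flows whose action list is NORMAL.
# Reads a flow dump on stdin, one flow per line (an assumption about the
# output format), and prints the matching lines unchanged.
arp_normal_flows() {
    grep 'arp' | grep 'actions=NORMAL'
}

# On the OVS host this would be used as:
#   sudo ovs-appctl bridge/dump-flows br0 | arp_normal_flows
```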

>
> There's some special handling for ARP for in-band control that is set in
> very-high-priority hidden flows in a late pipeline table.  Make sure you're
> not hitting those flows.
>

All of these hidden ARP flows have very high (18000+) priorities. Why
would they be here if we are operating in secure mode? More puzzling is
that this problem hit probably 50 OVS bridges, spread across all our
disjoint network topologies and disjoint control planes, seemingly
overnight.
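
One possible explanation, offered as an assumption rather than a diagnosis:
fail mode and in-band control are separate settings. Secure fail mode only
governs what the switch does when it loses its controller, while the hidden
flows come from in-band control, which is the default connection mode
whenever a controller is configured. The sketch below prints the two
documented ovs-vsctl settings that turn in-band control off. The helper
name disable_in_band and its dry-run argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print (default) or run the commands that disable
# in-band control on a bridge. Either setting alone should be enough;
# both are shown for reference.
#   $1 = bridge name
#   $2 = command prefix; defaults to "echo" so nothing is changed
disable_in_band() {
    bridge=$1
    run=${2:-echo}
    $run ovs-vsctl set controller "$bridge" connection-mode=out-of-band
    $run ovs-vsctl set bridge "$bridge" other-config:disable-in-band=true
}

# Dry run: only prints the two commands for br0.
disable_in_band br0
```

Passing sudo as the second argument would execute the commands instead of
printing them. If in-band control is the cause, the hidden ARP flows should
then no longer show up in the bridge/dump-flows output.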

>
>
>> -- the Dell switch gets all the ARPs and sends them as packet-ins to our
>> controller, so they are being forwarded by the OVS somehow
>>
>
> I still don't quite understand your topology graph, but sourcing packets
> from a host connected to an OVS bridge that it is itself hosting can get
> problematic without some namespacing.
>

Will look into this. Should the ideal setup be a veth pair -- one end
attached to the bridge and the other to a different netns?
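
A minimal sketch of that veth/netns arrangement, assuming standard iproute2
and ovs-vsctl commands. The names ns1, veth-br, and veth-ns are
placeholders, attach_netns is a made-up helper, and it only prints the
commands unless a prefix such as sudo is passed. It also assumes the
192.168.1.3/24 address has been removed from br0 first.

```shell
#!/bin/sh
# Hypothetical helper: attach a network namespace to an OVS bridge via a
# veth pair, so host traffic enters the bridge on a regular port, not LOCAL.
#   $1 = bridge, $2 = namespace, $3 = address/prefix for the namespace end
#   $4 = command prefix; defaults to "echo" (dry run)
attach_netns() {
    bridge=$1; ns=$2; addr=$3
    run=${4:-echo}
    $run ip netns add "$ns"
    $run ip link add veth-br type veth peer name veth-ns
    $run ip link set veth-ns netns "$ns"
    $run ovs-vsctl add-port "$bridge" veth-br
    $run ip link set veth-br up
    $run ip netns exec "$ns" ip addr add "$addr" dev veth-ns
    $run ip netns exec "$ns" ip link set veth-ns up
}

# Dry run with the addresses from this thread.
attach_netns br0 ns1 192.168.1.3/24
```

The ping would then be issued from inside the namespace, for example with
"ip netns exec ns1 ping 192.168.1.4".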

>
I hope this is a little better. Topology is:

[LAN with other hosts, one is 192.168.1.4]
        |
        |
[Dell-S4810--port24]----[eth1(1)--br0(LOCAL)]

IP 192.168.1.3/24 is assigned to br0. ARP packets sent to br0 by the host
running the OVS br0 bridge arrive on LOCAL. From there, we'd expect a
packet-in, which is evidently now being preempted by the matching hidden
ARP flow. Instead, OVS is forwarding ARP for us to port 1, which goes out
eth1 to our next hop switch.

>
>
>> -- tried installing explicit
>> priority=1,in_port=LOCAL,dl_type=0x806,actions=output:CONTROLLER flow; this
>> does not match the ARP packets. They are still forwarded thru OVS
>> -- there are no other routes on the host that could match the packets and
>> circumvent OVS
>>
>> My inclination is that OVS is forwarding all ARP packets "under the
>> table" and only sending L3+ and unknown ethertypes (LLDP perhaps?) to the
>> controller.
>>
>
> All I can guess right now is that you're hitting the in-band ARP matches,
> although I'm not sure why you've never had this problem before.  More
> information about your topology and bridge configuration might reveal
> something more useful.
>

Yes, we are hitting the in-band ARP matches, but again, as I mentioned
above, we've been running these OVS instances (of different versions) for
a very long time now, using LOCAL as a way for our hosts running OVS to
attach to the data plane. Almost every OVS bridge we have running (on our
own machines, in CloudLab, in GENI) has gotten into this state at
seemingly the same time. They're all part of different networks, with
different controllers, in different locations around the country.

>
> --
> Nick
>
_______________________________________________
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
