On 12/01/2011 01:11 PM, David Erickson wrote:
On 12/1/2011 12:05 PM, Ben Pfaff wrote:
On Thu, Dec 01, 2011 at 12:02:46PM -0800, David Erickson wrote:
On 12/1/2011 12:01 PM, Ben Pfaff wrote:
On Wed, Nov 30, 2011 at 06:18:15PM -0800, David Erickson wrote:
I think I've run into another semi-related issue with in-band control.
I recently changed the fail mode of my OVS instances from standalone
to secure because I wanted them to keep using the rules that had
been set if the controller died or needed restarting.
This had the unfortunate side effect that if the controller is down
long enough, the box can lose its DHCP lease and be unable to get a
new one: the DHCP requests are being sent to the controller, but the
switch can't connect to the controller without an address, i.e.:
Nov 30 21:04:52 localhost dhclient: DHCPDISCOVER on xenbr0 to 255.255.255.255 port 67 interval 4
Nov 30 21:04:56 localhost dhclient: No DHCPOFFERS received.
Nov 30 21:04:56 localhost dhclient: No working leases in persistent database - sleeping.
Nov 30 21:04:58 localhost ovs-vswitchd: 789878|stream_tcp|ERR|tcp:192.168.1.11:6633: connect: Network is unreachable
What is the behavior of the in-band rules when the switch is in
secure mode and has lost its connection to the controller?
No different from any other time.
It seems to me like they are being ignored, or at least the DHCP
rule doesn't seem to be working.
Please investigate further. I would start by finding out whether the
DHCP requests are making it out on the wire, then if they are, whether
DHCP replies are visible coming back across the wire.
Yeah, the requests aren't making it onto the wire.
Please capture the kernel flow that matches the request with
"ovs-dpctl dump-flows", then feed that flow back into "ofproto/trace"
to see what OVS is actually doing with it.
So in the base case, where dump-flows returns no flows at all, the host
has lost the DHCP lease it previously had, and the controller has become
unreachable, so the switch moved to fail-secure mode:
ovs-appctl ofproto/trace xenbr0 0 65534
ffffffffffff00c09f9ffed8080045100148000000001011a99600000000ffffffff004400430134819801010600c4bea10a000000000000000000000000000000000000000000c09f9ffed80000000000000000000000000000000000000000
Packet: 00:c0:9f:9f:fe:d8 > Broadcast, ethertype IPv4 (0x0800), length 96: truncated-ip - 246 bytes missing! 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:c0:9f:9f:fe:d8, length: 300
Flow: tunnel0:in_port0000:tci(0) mac00:c0:9f:9f:fe:d8->ff:ff:ff:ff:ff:ff type0800 proto17 tos16 ip0.0.0.0->255.255.255.255 port68->67
Rule: table=0 cookie=0 priority=180000,udp,in_port=0,dl_src=00:c0:9f:9f:fe:d8,tp_src=68,tp_dst=67
OpenFlow actions=NORMAL
Final flow: unchanged
Datapath actions: drop
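
For anyone who wants to double-check that the hex string handed to ofproto/trace really is the DHCP request, the headers can be decoded by hand. This is only a rough Python sketch: it assumes untagged Ethernet II and an IPv4 header without options, it includes just the first 50 bytes of the packet above, and decode_headers is a made-up helper name, not anything from OVS.

```python
import struct

# First 50 bytes of the packet given to ofproto/trace above:
# Ethernet (14) + IPv4 (20) + UDP (8) + the first 8 bytes of BOOTP.
PACKET_PREFIX_HEX = (
    "ffffffffffff00c09f9ffed80800"              # Ethernet header
    "45100148000000001011a99600000000ffffffff"  # IPv4 header
    "0044004301348198"                          # UDP header
    "01010600c4bea10a"                          # BOOTP op/htype/hlen/hops/xid
)


def _mac(b):
    return ":".join(f"{x:02x}" for x in b)


def _ip(b):
    return ".".join(str(x) for x in b)


def decode_headers(hex_str):
    """Decode the leading headers of a DHCP packet (hypothetical helper).

    Assumption: untagged Ethernet II and a 20-byte IPv4 header.
    """
    raw = bytes.fromhex(hex_str)
    dst, src, ethertype = struct.unpack("!6s6sH", raw[:14])
    (_ver_ihl, tos, total_len, _ident, _frag, ttl, proto, _csum,
     ip_src, ip_dst) = struct.unpack("!BBHHHBBH4s4s", raw[14:34])
    sport, dport, udp_len, _udp_csum = struct.unpack("!HHHH", raw[34:42])
    op, _htype, _hlen, _hops, xid = struct.unpack("!BBBBI", raw[42:50])
    return {
        "dl_src": _mac(src),
        "dl_dst": _mac(dst),
        "dl_type": ethertype,
        "nw_tos": tos,
        "nw_proto": proto,
        "nw_ttl": ttl,
        "ip_total_len": total_len,
        "nw_src": _ip(ip_src),
        "nw_dst": _ip(ip_dst),
        "tp_src": sport,
        "tp_dst": dport,
        "udp_len": udp_len,
        "bootp_op": op,   # 1 = BOOTREQUEST
        "bootp_xid": xid,
    }


for name, value in decode_headers(PACKET_PREFIX_HEX).items():
    print(f"{name}: {value}")
```

The decoded fields line up with the Flow line in the trace (type 0x0800, proto 17, tos 16, port 68 -> 67), so the packet being traced does look like the DHCP request and the rule that matches it is the one shown.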
Any ideas?
Here are some short instructions on how to reproduce:
-Connect OVS to a controller with fail mode set to standalone
-Set the DHCP server to hand out a very short lease to the host
XenServer machine (say 180 seconds)
-Release/renew the lease on the host so it picks up the new lease time
-Set the OVS switch fail mode to secure
-Add an iptables rule on your controller machine to block the XS host's IP
-OVS should lose its connection to the controller
-Ensure dump-flows shows no flows, particularly none that would handle
the DHCP request/response
-The DHCP lease should expire
-Badness should ensue
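
The scriptable parts of the steps above could be written down roughly as follows. This is a dry-run sketch in Python that only prints the commands and never executes them. The bridge name xenbr0 comes from the logs above; the host IP 192.0.2.10 is a placeholder, repro_commands is a made-up name, and using dhclient -r for the release is an assumption about the host setup. Changing the lease time on the DHCP server is left out because it depends on the server in use, and some ovs-dpctl versions may want a datapath name after dump-flows.

```python
import shlex


def repro_commands(bridge="xenbr0", host_ip="192.0.2.10"):
    """Return (where, argv) pairs for the scriptable reproduction steps.

    Hypothetical helper. host_ip is a placeholder for the XenServer
    host's address; nothing here is executed.
    """
    return [
        # Start out in standalone fail mode.
        ("host", ["ovs-vsctl", "set-fail-mode", bridge, "standalone"]),
        # Release and renew so the host picks up the short lease.
        ("host", ["dhclient", "-r", bridge]),
        ("host", ["dhclient", bridge]),
        # Switch to secure fail mode.
        ("host", ["ovs-vsctl", "set-fail-mode", bridge, "secure"]),
        # On the controller machine, block the host so OVS loses its connection.
        ("controller", ["iptables", "-A", "INPUT", "-s", host_ip, "-j", "DROP"]),
        # Confirm that no kernel flows are left.
        ("host", ["ovs-dpctl", "dump-flows"]),
    ]


for where, argv in repro_commands():
    print(f"[{where}] {shlex.join(argv)}")
```

After the last command shows no flows, the remaining steps are just waiting for the lease to expire.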
-David