On Wed, May 25, 2016 at 9:40 PM, Flavio Fernandes <fla...@flaviof.com> wrote:
> When responding to icmp echo requests (aka ping) packets, the logical > router should not restrict responses based on the inport. > > Example diagram: > > vm: IP1.1 (subnet1) > logical_router: IP1.2 (subnet1) and IP2.2 (subnet2) > > vm -------[subnet1]------- logical_router -------[subnet2] > <IP1.1> <IP1.2> <IP2.2> > > vm should be able to ping <IP1.2>, even though it is an address > of a subnet that can only be reached through L3 routing. > > Reference to the mailing list thread: > http://openvswitch.org/pipermail/discuss/2016-May/021172.html > > --- > Changes v1->v2: > - Add unit test. > > To be discussed: > > One caveat here is that ttl should be decremented before vm > reaches <IP2.2> and that logic is not invoked until later > in the pipeline. Further work may be necessary in order > to make ip.ttl part of the match in the logical table, so > pings from the non-local subnet are only responded if ttl > 1. > As far as I can tell, the match on logical flows currently does > not handle ip.ttl. > > In short, instead of simply removing the inport match in the icmp rule, > we could add a second rule that would do something like: > > "inport == %s && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && ... <= > priority 90 > "ip.ttl > 1 && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && ... <= > priority 90-X > These pings are destined to one of a router's own IP addresses; i.e. local to the router. The router is not forwarding the original ICMP request since an ICMP reply is a separate packet from the ICMP request. The below is an excerpt from RFC 1812 and it also references RFC 791. 4.2.2.9 Time to Live: RFC 791 Section 3.2 . . Note in particular that a router MUST NOT check the TTL of a packet except when forwarding it. . . A router MUST NOT discard a datagram just because it was received with a TTL equal to zero or one; if it is to the router and otherwise valid, the router must attempt to receive it. > > Signed-off-by: Flavio Fernandes <fla...@flaviof.com> > --- > ovn/northd/ovn-northd.c | 8 ++- > tests/ovn.at | 173 > ++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 178 insertions(+), 3 deletions(-) > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c > index 44e9430..68decbf 100644 > --- a/ovn/northd/ovn-northd.c > +++ b/ovn/northd/ovn-northd.c > @@ -1892,11 +1892,13 @@ build_lrouter_flows(struct hmap *datapaths, struct > hmap *ports, > free(match); > > /* ICMP echo reply. These flows reply to ICMP echo requests > - * received for the router's IP address. */ > + * received for the router's IP address. Since packets only > + * get here as part of the logical router datapath, the inport > + * (i.e. the incoming locally attached net) does not matter. */ > match = xasprintf( > - "inport == %s && (ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") > && " > + "(ip4.dst == "IP_FMT" || ip4.dst == "IP_FMT") && " > "icmp4.type == 8 && icmp4.code == 0", > - op->json_key, IP_ARGS(op->ip), IP_ARGS(op->bcast)); > + IP_ARGS(op->ip), IP_ARGS(op->bcast)); > char *actions = xasprintf( > "ip4.dst = ip4.src; " > "ip4.src = "IP_FMT"; " > diff --git a/tests/ovn.at b/tests/ovn.at > index e6ac1d7..8cd3677 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -2611,3 +2611,176 @@ OVS_APP_EXIT_AND_WAIT([ovs-vswitchd]) > OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > > AT_CLEANUP > + > +AT_SETUP([ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR]) > +AT_KEYWORDS([router-icmp-reply]) > +AT_SKIP_IF([test $HAVE_PYTHON = no]) > +ovn_start > + > +# Logical network: > +# One LR - R1 has switch ls1 (191.168.1.0/24) connected to it, > +# and has switch ls2 (172.16.1.0/24) connected to it. > + > +ovn-nbctl create Logical_Router name=R1 > + > +ovn-nbctl lswitch-add ls1 > +ovn-nbctl lswitch-add ls2 > + > +# Connect ls1 to R1 > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=ls1 \ > +network=192.168.1.1/24 mac=\"00:00:00:01:02:f1\" -- add Logical_Router > R1 \ > +ports @lrp -- lport-add ls1 rp-ls1 > + > +ovn-nbctl set Logical_port rp-ls1 type=router options:router-port=ls1 \ > +addresses=\"00:00:00:01:02:f1\" > + > +# Connect ls2 to R1 > +ovn-nbctl -- --id=@lrp create Logical_Router_port name=ls2 \ > +network=172.16.1.1/24 mac=\"00:00:00:01:02:f2\" -- add Logical_Router R1 > \ > +ports @lrp -- lport-add ls2 rp-ls2 > + > +ovn-nbctl set Logical_port rp-ls2 type=router options:router-port=ls2 \ > +addresses=\"00:00:00:01:02:f2\" > + > +# Create logical port ls1-lp1 in ls1 > +ovn-nbctl lport-add ls1 ls1-lp1 \ > +-- lport-set-addresses ls1-lp1 "00:00:00:01:02:03 192.168.1.2" > + > +# Create logical port ls2-lp1 in ls2 > +ovn-nbctl lport-add ls2 ls2-lp1 \ > +-- lport-set-addresses ls2-lp1 "00:00:00:01:02:04 172.16.1.2" > + > +# Create one hypervisor and create OVS ports corresponding to logical > ports. > +net_add n1 > + > +sim_add hv1 > +as hv1 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.1 > +ovs-vsctl -- add-port br-int vif1 -- \ > + set interface vif1 external-ids:iface-id=ls1-lp1 \ > + options:tx_pcap=hv1/vif1-tx.pcap \ > + options:rxq_pcap=hv1/vif1-rx.pcap \ > + ofport-request=1 > + > +ovs-vsctl -- add-port br-int vif2 -- \ > + set interface vif2 external-ids:iface-id=ls2-lp1 \ > + options:tx_pcap=hv1/vif2-tx.pcap \ > + options:rxq_pcap=hv1/vif2-rx.pcap \ > + ofport-request=1 > + > + > +# Allow some time for ovn-northd and ovn-controller to catch up. > +# XXX This should be more systematic. > +sleep 1 > + > + > +ip_to_hex() { > + printf "%02x%02x%02x%02x" "$@" > +} > +trim_zeros() { > + sed 's/\(00\)\{1,\}$//' > +} > +for i in 1 2; do > + : > vif$i.expected > +done > +# test_ipv4_icmp_request INPORT ETH_SRC ETH_DST IPV4_SRC IPV4_DST > IP_CHKSUM ICMP_CHKSUM [EXP_IP_CHKSUM EXP_ICMP_CHKSUM] > +# > +# Causes a packet to be received on INPORT. The packet is an ICMPv4 > +# request with ETH_SRC, ETH_DST, IPV4_SRC and IPV4_DST as specified. If > EXP_IP_CHKSUM > +# is provided, then it should be the ip checksum of the packet responded; > +# otherwise no reply is expected. > +# In the absence of an ip checksum calculation helpers, we will rely on > caller to provide the checksums for the ip > +# and the icmp headers. > +# XXX This should be more systematic. > +# > +# INPORT is an lport number, e.g. 11 for vif11. > +# ETH_SRC and ETH_DST are each 12 hex digits. > +# IPV4_SRC and IPV4_DST are each 8 hex digits. > +# IP_CHSUM and ICMP_CHKSUM are each 4 hex digits. > +# EXP_IP_CHSUM and EXP_ICMP_CHKSUM are each 4 hex digits. > +test_ipv4_icmp_request() { > + local inport=$1 eth_src=$2 eth_dst=$3 ipv4_src=$4 ipv4_dst=$5 > ip_chksum=$6 icmp_chksum=$7 > + local exp_ip_chksum=$8 exp_icmp_chksum=$9 > + shift; shift; shift; shift; shift; shift; shift > + shift; shift > + > + local ip_ttl=0a > + local icmp_id=5fbf > + local icmp_seq=0001 > + local icmp_data=$(seq 1 56 | xargs printf "%02x") > + local icmp_type_code_request=0800 > + local > icmp_payload=${icmp_type_code_request}${icmp_chksum}${icmp_id}${icmp_seq}${icmp_data} > + local > packet=${eth_dst}${eth_src}08004500005400004000${ip_ttl}01${ip_chksum}${ipv4_src}${ipv4_dst}${icmp_payload} > + > + as hv1 ovs-appctl netdev-dummy/receive vif$inport $packet > + if test X$exp_ip_chksum != X; then > + # Expect to receive the reply, if any. In same port where packet > was sent. > + # Note: src and dst are expected to be reversed. > + local icmp_type_code_response=0000 > + local reply_icmp_ttl=fe > + local > reply_icmp_payload=${icmp_type_code_response}${exp_icmp_chksum}${icmp_id}${icmp_seq}${icmp_data} > + local > reply=${eth_src}${eth_dst}08004500005400004000${reply_icmp_ttl}01${exp_ip_chksum}${ipv4_dst}${ipv4_src}${reply_icmp_payload} > + echo $reply >> vif$inport.expected > + fi > +} > + > +# send ping packet to router's ip addresses, from each of the 2 logical > ports. > +rtr_l1_ip=$(ip_to_hex 192 168 1 1) > +rtr_l2_ip=$(ip_to_hex 172 16 1 1) > +l1_ip=$(ip_to_hex 192 168 1 2) > +l2_ip=$(ip_to_hex 172 16 1 2) > + > +# ping router ip address that is on same subnet as the logical port > +test_ipv4_icmp_request 1 000000010203 0000000102f1 $l1_ip $rtr_l1_ip 0000 > 8510 0bff 8d10 > +test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l2_ip 0000 > 8510 0bff 8d10 > + > +# ping router ip address that is on the other side of the logical ports > +test_ipv4_icmp_request 1 000000010203 0000000102f1 $l1_ip $rtr_l2_ip 0000 > 8510 0bff 8d10 > +test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l1_ip 0000 > 8510 0bff 8d10 > + > +echo "---------NB dump-----" > +ovn-nbctl show > +echo "---------------------" > +ovn-nbctl list logical_router > +echo "---------------------" > +ovn-nbctl list logical_router_port > +echo "---------------------" > + > +echo "---------SB dump-----" > +ovn-sbctl list datapath_binding > +echo "---------------------" > +ovn-sbctl list logical_flow > +echo "---------------------" > + > +echo "------ hv1 dump ----------" > +as hv1 ovs-ofctl dump-flows br-int > + > +# check received packets against expected > +for inport in 1 2; do > + file=hv1/vif${inport}-tx.pcap > + echo $file > + $PYTHON "$top_srcdir/utilities/ovs-pcap.in" $file | trim_zeros > > received.packets > + cat vif$inport.expected | trim_zeros > expout > + AT_CHECK([cat received.packets], [0], [expout]) > +done > + > +as hv1 > +OVS_APP_EXIT_AND_WAIT([ovn-controller]) > +OVS_APP_EXIT_AND_WAIT([ovs-vswitchd]) > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as ovn-sb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as ovn-nb > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +as northd > +OVS_APP_EXIT_AND_WAIT([ovn-northd]) > + > +as main > +OVS_APP_EXIT_AND_WAIT([ovs-vswitchd]) > +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) > + > +AT_CLEANUP > -- > 1.9.1 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev