On Mon, Nov 24, 2014 at 9:49 AM, Stefano Salsano <stefano.sals...@uniroma2.it> wrote: > Il 21/11/2014 11:04, Stefano Salsano ha scritto: >> >> Il 12/11/2014 01:19, Ben Pfaff ha scritto: >>> >>> On Tue, Nov 11, 2014 at 11:25:34PM +0100, Stefano Salsano wrote: >>>> >>>> we need to POP out two MPLS labels in an egress node and then send >>>> out the packet as plain IP packet, but apparently this is not >>>> working in Open vSwitch v2.390 operating in kernel mode >> >> > > [...] >> >> >> we are now trying to add the correct check in compose_mpls_pop_action: >> >> } else if (n >= ctx->xbridge->max_mpls_depth) { >> COVERAGE_INC(xlate_actions_mpls_overflow); >> ctx->xout->slow |= SLOW_ACTION; >> } >> >> and we will let you know the results > > > Hi all, > > we added the if condition that forwards the packet to user space (slow path) > in case of POP with multiple MPLS labels (see patch attached) > > the result is that the packet is correctly forwarded, but there is a problem > in updating the stats > > unfortunately we do not have a clear picture of the architecture, but we see > that both the "handler" thread and the "revalidator" thread process the flow > > the "revalidator" thread returns this error message: "resubmit actions > recursed over 64 times" > > it seems that the both the handler and revalidator thread receive as input > this flow description: > > mpls,in_port=4,vlan_tci=0x0000,dl_src=5a:93:39:89:21:cf,dl_dst=22:4e:63:a1:65:37, > mpls_label=524292,mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=0,mpls_lse2=0 > > which according to us is wrong, as we see two inner mpls labels set set to 0 > > the correct flow in our opinion should be like this (that we have taken from > a debug print in the handler thread) > mpls,in_port=1,vlan_tci=0x0000,dl_src=5a:93:39:89:21:cf,dl_dst=22:4e:63:a1:65:37, > mpls_label=524292,mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=1073742144 > > it seems that the odp_flow_key_to_flow__ function > (https://github.com/openvswitch/ovs/blob/master/lib/odp-util.c#L3590) is not > able to retrieve the correct flow both in the handler and in the revalidator > thread (result = ODP_FIT_TOO_LITTLE) > > in the handler thread the function flow_extract is called afterwards > (https://github.com/openvswitch/ovs/blob/master/ofproto/ofproto-dpif-upcall.c#L634), > setting the correct flow information and allowing the forwarding of the > packet, while in the validator thread this does not happen and the validator > is stuck with the wrong flow information > > do you have any hint on how we can fix the problem ?? >
Multiple MPLS PUSH and pop should be handled without userspace action. multiple MPLS PUSH can be a single action, kernel datapath support this operation. Multiple POP can be handled by recirculating after every MPLS POP action. If you are going to investigate it more ofpact_needs_recirculation_after_mpls() should be looked at. Fix it to return true for multiple MPLS pop. > thank you > Stefano & Pier Luigi > > > > >> >> >> ***************************************************************** >> Detailed Report of the problem sent on 18/11/2014 >> >> 1) recirculation does not work properly when there are >> two or more mpls_pop, let us compare what happens in our system when >> using the kernel datapath (not working) or the userspace netdev datapath >> (working) >> >> we are using this reference topology >> h1---peo1----peo2---h2 >> h1 source node, peo1 ingress node, peo2 egress node, h2 dest node >> >> in peo1, we report the rules shown with ovs-ofctl (obviously they are >> the same with kernel and netdev datapath) >> >> peo1: >> -> cookie=0x0, duration=185.672s, table=0, n_packets=178, n_bytes=17444, >> ip,in_port=2 >> >> actions=push_mpls:0x8847,set_field:262144->mpls_label,push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1 >> >> -> cookie=0x0, duration=185.617s, table=1, n_packets=178, n_bytes=17444, >> mpls,in_port=2,mpls_label=524292 actions=output:1 >> >> recirculation causes the installation of flows that are not visible with >> the ovs-ofctl command, so we use ovs-appctl >> (sudo ovs-appctl dpif/dump-flows peo1) >> >> the result is different with kernel and netdev datapath: >> >> kernel datapath: >> -> recirc_id(0),in_port(2),eth_type(0x0800), >> ipv4(tos=0/0xfc,ttl=64,frag=no), packets:14, bytes:1372, used:0.692s, >> actions:userspace(pid=4294959673,slow_path(action)) >> >> netdev datapath: >> -> recirc_id(0),in_port(2),eth_type(0x0800), >> ipv4(tos=0/0xfc,ttl=64,frag=no), packets:17, bytes:1666, used:0.664s, >> >> actions:push_mpls(label=262144,tc=0,ttl=64,bos=1,eth_type=0x8847),push_mpls(label=524292,tc=0,ttl=64,bos=0,eth_type=0x8847),1 >> >> >> we note that with kernel datapath, there is a fallback to userspace >> processing that is triggered when multiple PUSH operation are requested >> >> in peo2, we report the rules shown with ovs-ofctl (obviously they are >> the same with kernel and netdev datapath) >> >> peo2: >> -> cookie=0x0, duration=621.359s, table=0, n_packets=284, n_bytes=30104, >> mpls,in_port=1 actions=goto_table:1 >> -> cookie=0x0, duration=621.299s, table=1, n_packets=284, >> n_bytes=30104, mpls,in_port=1,mpls_label=524292 >> actions=pop_mpls:0x8847,resubmit(,1) >> -> cookie=0x0, duration=621.184s, table=1, n_packets=284, n_bytes=30104, >> mpls,in_port=1,mpls_label=262144,mpls_bos=1 >> actions=pop_mpls:0x0800,output:2 >> >> recirculation causes the installation of flows that are not visible with >> the ovs-ofctl command, so we use ovs-appctl >> (sudo ovs-appctl dpif/dump-flows peo1) >> >> the result is different with kernel and netdev datapath: >> >> kernel datapath: >> NO RULES ARE SHOWN !!!! >> >> netdev datapath: >> -> recirc_id(0),in_port(4),eth_type(0x8847), >> mpls(lse0=0x80004040/0xfffff100, lse1=0x40000140/0xffffffff), >> packets:5, bytes:530, used:0.720s, >> actions:pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),5 >> >> peo2 is the egress node where we execute the double pop, it seems that >> something is preventing the installation of the "recirculation" flows >> when using the kernel datapath >> >> We have also checked the ovs-vswitchd log and when we activate the >> kernel datapath, we see these warning messages: >> >> -> |WARN|system@ovs-system: failed to put[create] (Invalid argument) >> >> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(4),skb_mark(0/0),eth(src=e6:da:61:3a:f5:81/00:00:00:00:00:00,dst=0a:a5:58:7d:1c:12/00:00:00:00:00:00),eth_type(0x8847),mpls(label=524292/0xfffff,tc=0/0,ttl=64/0x0,bos=0/1), >> >> actions:pop_mpls(eth_type=0x8847),pop_mpls(eth_type=0x800),5 >> -> |WARN|system@ovs-system: execute pop_mpls(eth_type=0x8847), >> pop_mpls(eth_type=0x800),5 failed >> (Invalid argument) on packet mpls,in_port=0,vlan_tci=0x0000, >> dl_src=e6:da:61:3a:f5:81,dl_dst=0a:a5:58:7d:1c:12,mpls_label=524292, >> mpls_tc=0,mpls_ttl=64,mpls_bos=0,mpls_lse1=1073742144 >> >> These messages explain also the absence of the "recirculation flows", >> even if in our mininet emulation scenario there is a single ovs-vswitchd >> instance, we can refer these messages to peo2 from the "output port" >> >> We also noted at the beginning of the vswitchd-log these messages: >> >> When using the Kernel datapath: >> 2014-11-17T19:09:23.698Z|00029|ofproto_dpif|INFO|system@ovs-system: >> MPLS label stack length probed as 1 >> >> When using the userspace netdev datapath: >> 2014-11-17T19:17:37.697Z|00069|ofproto_dpif|INFO|netdev@ovs-netdev: >> MPLS label stack length probed as 3 >> >> We have checked in the ovs code, this print is generated by the >> function static size_t check_max_mpls_depth(struct dpif_backer >> *backer), this function tries to probe the stack length... >> >> we would like to understand what is behind this limitation... why in our >> system the stack length for the kernel datapath is limited to 1 ? >> does the stack length depend on some features provided by the >> system/kernel? is there some kernel version without this limitation? >> >> assuming that we have this limitation, our understanding is that this >> control does not prevent to PUSH three mpls label because the packet is >> sent to the userspace ovs-switchd as a fallback (and the packet is >> correctly created and forwarded)... is this correct? >> >> if this is the case, it seems that the same fallback is not activated in >> case of multiple POP operations (>=2 POPs) the system tries to execute >> the multiple POPs in the kernel but it generates the error messages >> listed above and no packet is forwarded >> >> >> End Detailed Report of the problem >> ***************************************************************** >> >> >> >> >> >> >> >> >>>> >>>> We have replicated the problem in a simple mininet setup (script >>>> attached) with four nodes connected as follows: >>>> >>>> h1---peo1----peo2---h2 >>>> h1 source node, peo1 ingress node, peo2 egress node, h2 dest node >>>> >>>> - h1 pings h2 (automatic static arp table are used, so there is no ARP) >>>> - peo1 pushes two mpls labels in the packet >>>> - peo2 is supposed to pop the two labels and forward the packet as IP >>>> >>>> we have a rule to match the outer label, POP it and resubmit as MPLS >>>> packet: >>>> actions=pop_mpls:0x8847,resubmit(,1) >>>> >>>> then we have a rule to match the inner label, POP it and output on >>>> a port as an IP packet: >>>> actions=pop_mpls:0x0800,output:11 >>>> >>>> both rules match (and the first pop works), but then the packet >>>> does not exit from the port 11 >>>> >>>> we have also tried without success to recirculate the packet >>>> implicitly after the first pop, replacing the first action with: >>>> actions=pop_mpls:0x8847 >>>> >>>> Can anyone help us to identify the problem ? Are we doing something >>>> wrong or is it a bug? >>>> >>>> The mininet script to setup the topology and create the rules in >>>> order to reproduce the problem is attached. >>>> >>>> Some details are reported hereafter for the scenario with explicit >>>> resubmit. >>>> >>>> thank you in advance for your help... >>>> >>>> ciao >>>> Stefano >>>> >>>> - peo1 flow table >>>> OFPST_FLOW reply (OF1.3) (xid=0x2): >>>> duration=275.553s, table=0, n_packets=58, n_bytes=5684,ip,in_port=2 >>>> >>>> actions=push_mpls:0x8847,set_field:262144->mpls_label,push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1 >>>> >>>> >>>> >>>> duration=275.493s,table=1,n_packets=58,n_bytes=5684,mpls,in_port=2,mpls_label=524292 >>>> >>>> actions=output:1 >>>> >>>> - peo2 flow table >>>> OFPST_FLOW reply (OF1.3) (xid=0x2): >>>> >>>> duration=317.678s,table=0,n_packets=58,n_bytes=6148,mpls,in_port=1 >>>> actions=goto_table:1 >>>> >>>> >>>> duration=317.618s,table=1,n_packets=58,n_bytes=6148,mpls,in_port=1,mpls_label=524292 >>>> >>>> actions=pop_mpls:0x8847,resubmit(,1) >>>> >>>> >>>> duration=317.559s,table=1,n_packets=58,n_bytes=6148,mpls,in_port=1,mpls_label=262144,mpls_bos=1 >>>> >>>> actions=pop_mpls:0x0800,output:2 >>>> >>>> NB If the interfaces created by Mininet in the root namespace happen >>>> to be down, stop the Network Manager and then restart the experiment >>>> -- >>>> ******************************************************************* >>>> Stefano Salsano >>>> Professore Associato >>>> Dipartimento Ingegneria Elettronica >>>> Universita' di Roma Tor Vergata >>>> Via del Politecnico, 1 - 00133 Roma - ITALY >>>> >>>> http://netgroup.uniroma2.it/Stefano_Salsano/ >>>> >>>> E-mail : stefano.sals...@uniroma2.it >>>> Cell. : +39 320 4307310 >>>> Office : (Tel.) +39 06 72597770 (Fax.) +39 06 72597435 >>>> ******************************************************************* >>>> >>> >>>> #!/usr/bin/python >>>> >>>> import subprocess >>>> >>>> from mininet.net import Mininet >>>> from mininet.node import Host, OVSKernelSwitch, Node >>>> from mininet.cli import CLI >>>> from mininet.log import lg, info >>>> >>>> def setup(): >>>> >>>> lg.setLogLevel('info') >>>> >>>> net = Mininet(switch=OVSKernelSwitch, build=False, >>>> autoStaticArp=True ) >>>> >>>> host1 = net.addHost("h1") >>>> host2 = net.addHost("h2") >>>> >>>> peo1 = net.addSwitch("peo1") >>>> peo2 = net.addSwitch("peo2") >>>> >>>> net.addLink(peo1, peo2) # 1 | 1 >>>> net.addLink(host1, peo1)# N/A | 2 >>>> net.addLink(host2, peo2)# N/A | 2 >>>> >>>> net.start() >>>> >>>> #PEO1 Configuration >>>> root = Node( 'root', inNamespace=False ) >>>> root.cmd('ovs-vsctl --no-wait set bridge %s protocols=OpenFlow13' >>>> %(peo1.name)) >>>> root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s >>>> >>>> "table=0,hard_timeout=0,priority=32768,in_port=2,eth_type=0x800,actions=push_mpls:0x8847,set_field:262144->mpls_label,push_mpls:0x8847,set_field:524292->mpls_label,goto_table:1"' >>>> %(peo1.name)) >>>> root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s >>>> >>>> "table=1,hard_timeout=0,priority=32768,in_port=2,eth_type=0x8847,mpls_label=524292,action=output:1"' >>>> %(peo1.name)) >>>> >>>> #PEO2 Configuration >>>> root.cmd('ovs-vsctl --no-wait set bridge %s protocols=OpenFlow13' >>>> %(peo2.name)) >>>> root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s >>>> >>>> "table=0,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,actions=goto_table=1"' >>>> %(peo2.name)) >>>> root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s >>>> >>>> "table=1,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,mpls_label=524292,action=pop_mpls:0x8847,resubmit(,1)"' >>>> %(peo2.name)) >>>> root.cmd('ovs-ofctl -O OpenFlow13 add-flow %s >>>> >>>> "table=1,hard_timeout=0,priority=32768,in_port=1,eth_type=0x8847,mpls_label=262144,mpls_bos=1,action=pop_mpls:0x800,output:2"' >>>> %(peo2.name)) >>>> >>>> CLI(net) >>>> net.stop() >>>> subprocess.call(["sudo", "mn", "-c"], stdout=None, stderr=None) >>>> >>>> >>>> if __name__ == '__main__': >>>> setup() >>>> >>>> >>>> >>>> >>>> >>>> >>> >>>> _______________________________________________ >>>> discuss mailing list >>>> discuss@openvswitch.org >>>> http://openvswitch.org/mailman/listinfo/discuss >>> >>> >>> >> >> > > > -- > ******************************************************************* > Stefano Salsano > Professore Associato > Dipartimento Ingegneria Elettronica > Universita' di Roma Tor Vergata > Via del Politecnico, 1 - 00133 Roma - ITALY > > http://netgroup.uniroma2.it/Stefano_Salsano/ > > E-mail : stefano.sals...@uniroma2.it > Cell. : +39 320 4307310 > Office : (Tel.) +39 06 72597770 (Fax.) +39 06 72597435 > ******************************************************************* _______________________________________________ discuss mailing list discuss@openvswitch.org http://openvswitch.org/mailman/listinfo/discuss