On 5/9/25 2:14 PM, Dumitru Ceara wrote:
> On 5/9/25 5:38 AM, Trọng Đạt Trần wrote:
>> Hi Dumitru,
>>
>
> Hi Oscar,
>
>> Thank you for pointing that out.
>>
>> To clarify: the terms “inbound” and “outbound” in my previous message
>> were used from the *VM’s perspective*.
>>
>> Topology:
>>
>> |vm_a ---- network1 ---- router ---- network2 ---- vm_b |
>>
>> ACLs:
>>
>>   * *ACL A*: allow-related VMs to *send* IPv4 traffic
>>     (|direction=from-lport|)
>>   * *ACL B*: allow-related VMs to *receive* ICMP traffic
>>     (|direction=to-lport|)
>>
>> I’ve attached both the *Northbound and Southbound database dumps* to
>> ensure the full context is available.
>>
>
> Thanks for the info, I tried locally with a simplified setup where I
> emulate your topology:
>
> switch c9c171ef-849c-436d-b3f9-73d83b9c4e5d (ls)
>     port vm2
>         addresses: ["00:00:00:00:00:02"]
>     port vm1
>         addresses: ["00:00:00:00:00:01"]
>
> Those two VIFs are in a port group:
>
> # ovn-nbctl list port_group
> _uuid               : 7e7a96b9-e708-4eea-b380-018314f2435c
> acls                : [1d0e7b71-ff03-4c78-ace4-2448bf237e11, 7cb023e9-fee5-4576-a67d-ce1f5d98805b]
> external_ids        : {}
> name                : pg
> ports               : [d991baa6-21b0-4d46-a15d-71b9e8d6708d, f2c5679c-d891-4d34-8402-8bc2047fba61]
>
> With two ACLs applied:
> # ovn-nbctl acl-list pg
> from-lport   100 (inport==@pg && ip4) allow-related
>   to-lport   200 (outport==@pg && ip4 && icmp4) allow-related
>
> Both ACLs have only sampling for established traffic (sample_est) set:
> # ovn-nbctl list acl
> _uuid               : 1d0e7b71-ff03-4c78-ace4-2448bf237e11
> action              : allow-related
> direction           : from-lport
> match               : "inport==@pg && ip4"
> priority            : 100
> sample_est          : 23153fae-0a73-4f86-bdf2-137e76647da8
> sample_new          : []
>
> _uuid               : 7cb023e9-fee5-4576-a67d-ce1f5d98805b
> action              : allow-related
> direction           : to-lport
> match               : "outport==@pg && ip4 && icmp4"
> priority            : 200
> sample_est          : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> sample_new          : []
>
> # ovn-nbctl list sample
> _uuid               : 23153fae-0a73-4f86-bdf2-137e76647da8
> collectors          : [82540855-dcd4-44e4-8354-e08a972500cd]
> metadata            : 2000000
>
> _uuid               : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> collectors          : [82540855-dcd4-44e4-8354-e08a972500cd]
> metadata            : 1000000
>
> Then I send a single ICMP echo packet from vm2 towards vm1. The ICMP
> echo hits both ACLs but, because it's the packet initiating the session,
> it doesn't generate a sample (sample_new is not set in the ACLs). Instead,
> 2 conntrack entries are created for the ICMP session:
>
> - one in the CT zone of vm2 - here the from-lport ACL is hit so the
>   sample_est metadata of the from-lport ACL (2000000) is stored in
>   the conntrack state
>
> - one in the CT zone of vm1 - here the to-lport ACL is hit so the
>   sample_est metadata of the to-lport ACL (1000000) is stored in the
>   conntrack state
>
> The ICMP echo packet reaches vm1, which replies with an ICMP Echo Reply.
>
> For the reply, the CT zone of vm1 is checked first: we match the existing
> conntrack entry (its state moves to "established") and a sample for the
> stored metadata, 1000000, is generated. Then, in the egress pipeline,
> the CT zone of vm2 is checked: we match the other existing conntrack
> entry (its state also moves to "established") and a sample for the
> stored metadata, 2000000, is generated.
>
> This seems correct to me. Stats also seem to confirm that:
> # ip netns exec vm2 ping 42.42.42.2 -c1
> PING 42.42.42.2 (42.42.42.2) 56(84) bytes of data.
> 64 bytes from 42.42.42.2: icmp_seq=1 ttl=64 time=1.46 ms
>
> --- 42.42.42.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.455/1.455/1.455/0.000 ms
>
> # ovs-ofctl dump-ipfix-flow br-int
> NXST_IPFIX_FLOW reply (xid=0x2): 1 ids
>   id 2: flows=2, current flows=0, sampled pkts=2, ipv4 ok=2, ipv6 ok=0, tx pkts=11
>         pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=11
>
> But then, when I increase the number of packets, things become more
> interesting.
> ICMP echos also generate samples. And while that might
> seem like a bug, it's not. :)
>
> When ping sends multiple packets for a single invocation it uses the
> same ICMP ID and just increments the ICMP seq, e.g.:
>
> 14:07:41.986618 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 58647, offset 0, flags [DF], proto ICMP (1), length 84)
>     42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 1, length 64
>
> 14:07:42.988077 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 59085, offset 0, flags [DF], proto ICMP (1), length 84)
>     42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 2, length 64
>
> But conntrack doesn't use the ICMP ID in the key for the session it
> installs:

Sorry about the typo, I meant to say "conntrack doesn't use the ICMP SEQ
in the key for the session it installs, it only uses the ICMP ID".

>
> # ovs-appctl dpctl/dump-conntrack | grep 42.42.42
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=4,mark=131104,labels=0xf4240000000000000000000000000
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=6,mark=131072,labels=0x1e8480000000000000000000000000
>
> So, subsequent ICMP requests will match on these two existing
> established entries and, because sample_est is configured, samples are
> generated for them too.
>
> That's also visible in the datapath flows that forward packets in the
> "original" direction (ICMP ECHOs in our case):
>
> # ovs-appctl dpctl/dump-flows | grep sample | grep '\-rpl'
> recirc_id(0x29),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20000/0xff0071),ct_label(0x1e8480000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:01),eth_type(0x0800),ipv4(proto=1,frag=no),
> packets:8, bytes:784, used:2.342s,
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=2000000,output_port=4294967295)),ct(commit,zone=6,mark=0x20000/0xff0071,label=0x1e8480000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),ct(zone=4),recirc(0x2a)
>
> recirc_id(0x2a),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20020/0xff0071),ct_label(0xf4240000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:00/ff:ff:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),
> packets:8, bytes:784, used:2.342s,
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=1000000,output_port=4294967295)),ct(commit,zone=4,mark=0x20020/0xff0071,label=0xf4240000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),1
>
> So, for a less complicated test, maybe you should try with UDP/TCP instead.
>
> I hope that clarifies your doubts.
>
> Best regards,
> Dumitru
>
>> Best regards,
>>
>> Oscar
>>
>>
>> On Thu, May 8, 2025 at 8:11 PM Dumitru Ceara <dce...@redhat.com> wrote:
>>
>>     Hi Oscar,
>>
>>     On 5/6/25 12:31 PM, Trọng Đạt Trần wrote:
>>     > As requested, I’ve attached additional tracing information related to
>>     > the sampling duplication issue.
>>     >
>>     >   * The file |ofproto_trace.log| contains the full output of
>>     >     |ofproto/trace| commands.
>>     >   * The archive |ovn-detrace.tar.gz| includes six separate files, each
>>     >     corresponding to an |ovn-detrace| output for a flow I believe is
>>     >     involved in the duplicated sampling.
>>     >
>>     > Since I’m not fully confident in how to use the |--ct-next| option, I’ve
>>     > included traces for all six related flows to ensure completeness.
>>     >
>>     > Please let me know if you need further details, or if I should re-run
>>     > any commands with additional options.
>>     >
>>
>>     This seems fairly easy to reproduce locally for investigation; I haven't
>>     tried yet, though. However, would you mind sharing your OVN NB database
>>     file (I'm assuming this is a test environment)?
>>
>>     I would like to make sure we don't have any misunderstanding because the
>>     terms you use below in your ACL description (e.g., "outbound"/"inbound")
>>     are not standard terms. Having the actual ACL (and the rest of the NB)
>>     contents will make it easier to debug.
>>
>>     Thanks,
>>     Dumitru
>>
>>     > Best regards,
>>     >
>>     > *Oscar*
>>     >
>>     >
>>     > On Tue, May 6, 2025 at 4:15 PM Adrián Moreno <amore...@redhat.com> wrote:
>>     >
>>     >     On Tue, May 06, 2025 at 11:48:07AM +0700, Trọng Đạt Trần wrote:
>>     >     > Dear Adrián,
>>     >     >
>>     >     > Thank you for your response. I’ve applied your suggestion to use
>>     >     > separate sample entries for each ACL. However, I am still seeing
>>     >     > unexpected behavior in the IPFIX output that I’d like to clarify.
>>     >     >
>>     >     > Test Setup (Same as Before)
>>     >     >
>>     >     > vm_a ---- network1 ---- router ---- network2 ---- vm_b
>>     >     >
>>     >     >   - Two ACLs:
>>     >     >       - ACL A: allow-related *outbound* IPv4
>>     >     >       - ACL B: allow-related *inbound* ICMP
>>     >     >   - ACLs applied symmetrically to both VMs.
>>     >     >   - Test traffic: ICMP request from vm_b to vm_a, and reply from
>>     >     >     vm_a to vm_b.
>>     >     >
>>     >     > Key Problem Observed
>>     >     >
>>     >     > When sampling is enabled on *both* ACLs, the IPFIX record for
>>     >     > *flow (3)* (the ICMP reply from vm_a → router) shows
>>     >     > *120 packets/min*.
>>     >     >
>>     >     > However:
>>     >     >
>>     >     >   - If *only ACL B* (inbound ICMP) is sampled → (3) = 60 packets/min
>>     >     >   - If *only ACL A* (outbound IPv4) is sampled → (3) not present
>>     >     >   - If both are sampled → (3) = 120 packets/min
>>     >     >
>>     >     > This suggests that *flow (3) is being sampled twice*, even though
>>     >     > it represents a *single logical flow and matches only ACL B*.
>>     >     >
>>     >     > IPFIX Observations
>>     >     >
>>     >     > Flow  Description                   Expected  Actual
>>     >     > (1)   vm_b → router (ICMP request)  60 pkt/m  60
>>     >     > (2)   router → vm_a (ICMP request)  60 pkt/m  60
>>     >     > (3)   vm_a → router (ICMP reply)    60 pkt/m  120 ⚠️
>>     >     > (4)   router → vm_b (ICMP reply)    60 pkt/m  60
>>     >
>>     >     This is not what I'd expect, maybe Dumitru knows?
>>     >
>>     >     Could you attach ofproto/trace and ovn-detrace outputs from both
>>     >     directions?
>>     >
>>     >     Thanks.
>>     >     Adrián
>>     >
>>
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
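
The sampling behaviour explained in this thread can be illustrated with a toy model. The sketch below is a hypothetical Python simulation, not OVN or OVS code. The `Zone` class and `ping` function are invented for illustration. It assumes each conntrack zone keys an ICMP session on the two endpoints plus the ICMP ID (never the sequence number), that the first packet of a session only commits the entry (sample_new unset), and that every later packet matching the entry emits the stored sample_est metadata. As a simplification, it treats any packet that matches an existing entry as established.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Toy conntrack zone with one sample_est ACL attached (hypothetical model)."""
    sample_est_metadata: int
    sessions: set = field(default_factory=set)

    def process(self, src, dst, icmp_id, seq):
        # ICMP session key: both endpoints plus the ICMP ID. The sequence
        # number is deliberately not part of the key.
        key = (frozenset((src, dst)), icmp_id)
        if key not in self.sessions:
            # First packet of the session: commit the entry. sample_new is
            # unset on the ACL, so nothing is sampled.
            self.sessions.add(key)
            return None
        # Any later packet, request or reply, matches the existing entry and
        # triggers the sample_est sample. (Simplification: real conntrack
        # marks the entry established once the reply is seen.)
        return self.sample_est_metadata


def ping(count, vm2="42.42.42.3", vm1="42.42.42.2", icmp_id=35717):
    """Return the sample metadata values emitted for `count` echo/reply pairs."""
    zone_vm2 = Zone(sample_est_metadata=2000000)  # from-lport ACL
    zone_vm1 = Zone(sample_est_metadata=1000000)  # to-lport ACL
    samples = []
    for seq in range(1, count + 1):
        # Echo request: vm2's zone first, then vm1's zone.
        for zone in (zone_vm2, zone_vm1):
            samples.append(zone.process(vm2, vm1, icmp_id, seq))
        # Echo reply: vm1's zone first, then vm2's zone.
        for zone in (zone_vm1, zone_vm2):
            samples.append(zone.process(vm1, vm2, icmp_id, seq))
    return [s for s in samples if s is not None]


if __name__ == "__main__":
    print(len(ping(1)))  # one ping: only the reply is sampled, once per zone
    print(len(ping(3)))  # three pings: echoes 2 and 3 are sampled as well
```

Under these assumptions a single ping yields 2 samples, which agrees with the `sampled pkts=2` counter reported for `ping -c1`, while three pings yield 10 because the second and third echo requests reuse the same ICMP ID and hit the existing entries. A UDP or TCP test with a fresh 5-tuple per session would not show this effect on the request side.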