On 5/9/25 2:14 PM, Dumitru Ceara wrote:
> On 5/9/25 5:38 AM, Trọng Đạt Trần wrote:
>> Hi Dumitru,
>>
> 
> Hi Oscar,
> 
> 
>> Thank you for pointing that out.
>>
>> To clarify: the terms “inbound” and “outbound” in my previous message
>> were used from the *VM’s perspective*.
>>
>>
>>       Topology:
>>
>> |vm_a ---- network1 ---- router ---- network2 ---- vm_b |
>>
>>
>>       ACLs:
>>
>>   * *ACL A*: allow-related VMs to *send* IPv4 traffic
>>     (|direction=from-lport|)
>>
>>   * *ACL B*: allow-related VMs to *receive* ICMP traffic
>>     (|direction=to-lport|)
>>
>> I’ve attached both the *Northbound and Southbound database dumps* to
>> ensure the full context is available.
>>
> 
> Thanks for the info, I tried locally with a simplified setup where I
> emulate your topology:
> 
> switch c9c171ef-849c-436d-b3f9-73d83b9c4e5d (ls)
>     port vm2
>         addresses: ["00:00:00:00:00:02"]
>     port vm1
>         addresses: ["00:00:00:00:00:01"]
> 
> Those two VIFs are in a port group:
> 
> # ovn-nbctl list port_group
> _uuid               : 7e7a96b9-e708-4eea-b380-018314f2435c
> acls                : [1d0e7b71-ff03-4c78-ace4-2448bf237e11,
> 7cb023e9-fee5-4576-a67d-ce1f5d98805b]
> external_ids        : {}
> name                : pg
> ports               : [d991baa6-21b0-4d46-a15d-71b9e8d6708d,
> f2c5679c-d891-4d34-8402-8bc2047fba61]
> 
> With two ACLs applied:
> # ovn-nbctl acl-list pg
> from-lport   100 (inport==@pg && ip4) allow-related
>   to-lport   200 (outport==@pg && ip4 && icmp4) allow-related
> 
> Both ACLs have only sampling for established traffic (sample_est) set:
> # ovn-nbctl list acl
> _uuid               : 1d0e7b71-ff03-4c78-ace4-2448bf237e11
> action              : allow-related
> direction           : from-lport
> match               : "inport==@pg && ip4"
> priority            : 100
> sample_est          : 23153fae-0a73-4f86-bdf2-137e76647da8
> sample_new          : []
> 
> _uuid               : 7cb023e9-fee5-4576-a67d-ce1f5d98805b
> action              : allow-related
> direction           : to-lport
> match               : "outport==@pg && ip4 && icmp4"
> priority            : 200
> sample_est          : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> sample_new          : []
> 
> # ovn-nbctl list sample
> _uuid               : 23153fae-0a73-4f86-bdf2-137e76647da8
> collectors          : [82540855-dcd4-44e4-8354-e08a972500cd]
> metadata            : 2000000
> 
> _uuid               : 42391c82-23d2-4f2b-a7b9-88afaa68282c
> collectors          : [82540855-dcd4-44e4-8354-e08a972500cd]
> metadata            : 1000000
> 
> Then I send a single ICMP echo packet from vm2 towards vm1.  The ICMP
> echo hits both ACLs but, because it's the packet initiating the session,
> it doesn't generate a sample (sample_new is not set in the ACLs).
> Instead, 2 conntrack entries are created for the ICMP session:
> 
> - one in the CT zone of vm2 - here the from-lport ACL is hit so the
> sample_est metadata of the from-lport ACL (2000000) is stored in
> the conntrack state
> 
> - one in the CT zone of vm1 - here the to-lport ACL is hit so the
> sample_est metadata of the to-lport ACL (1000000) is stored in the
> conntrack state
> 
> The ICMP echo packet reaches vm1 which replies with ICMP ECHO Reply.
> 
> For the reply, the CT zone of vm1 is checked first: we match the existing
> conntrack entry (its state moves to "established") and a sample for the
> stored metadata, 1000000, is generated.  Then, in the egress pipeline,
> the CT zone of vm2 is checked: we match the other existing conntrack
> entry (its state also moves to "established") and a sample for the
> stored metadata, 2000000, is generated.
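
The sequence described above can be sketched as a toy model.  This is
only an illustration, not OVN or OVS code: the CtZone class and the
session_key() helper are hypothetical names, and the model assumes one
conntrack entry per zone that stores the sample_est metadata of the ACL
hit in that zone.

```python
from dataclasses import dataclass, field


def session_key(src, dst, icmp_id):
    """Direction-independent key so requests and replies hit the same entry."""
    return (frozenset((src, dst)), icmp_id)


@dataclass
class CtZone:
    """Toy conntrack zone (hypothetical model, not the real datapath)."""
    sample_est: int                      # metadata of the ACL hit in this zone
    entries: dict = field(default_factory=dict)

    def process(self, key):
        """Return sample metadata for an existing entry, None for a new one."""
        if key not in self.entries:
            # Session-initiating packet: commit the entry, emit no sample
            # because sample_new is not configured.
            self.entries[key] = "new"
            return None
        self.entries[key] = "established"
        return self.sample_est


zone_vm2 = CtZone(sample_est=2000000)    # from-lport ACL is hit here
zone_vm1 = CtZone(sample_est=1000000)    # to-lport ACL is hit here

samples = []

# ICMP echo request, vm2 (42.42.42.3) -> vm1 (42.42.42.2).
key = session_key("42.42.42.3", "42.42.42.2", 35717)
for zone in (zone_vm2, zone_vm1):
    metadata = zone.process(key)
    if metadata is not None:
        samples.append(metadata)

# ICMP echo reply, vm1 -> vm2: vm1's zone first, then vm2's zone.
key = session_key("42.42.42.2", "42.42.42.3", 35717)
for zone in (zone_vm1, zone_vm2):
    metadata = zone.process(key)
    if metadata is not None:
        samples.append(metadata)

print(samples)  # [1000000, 2000000]
```

The request produces no samples and the reply produces one per zone,
which is in line with the "sampled pkts=2" counter in the IPFIX stats.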
> 
> This seems correct to me.  Stats also seem to confirm that:
> # ip netns exec vm2 ping 42.42.42.2 -c1
> PING 42.42.42.2 (42.42.42.2) 56(84) bytes of data.
> 64 bytes from 42.42.42.2: icmp_seq=1 ttl=64 time=1.46 ms
> 
> --- 42.42.42.2 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 1.455/1.455/1.455/0.000 ms
> 
> # ovs-ofctl dump-ipfix-flow br-int
> NXST_IPFIX_FLOW reply (xid=0x2): 1 ids
>   id   2: flows=2, current flows=0, sampled pkts=2, ipv4 ok=2, ipv6
> ok=0, tx pkts=11
>           pkts errs=0, ipv4 errs=0, ipv6 errs=0, tx errs=11
> 
> But then, when I increase the number of packets things become more
> interesting.  ICMP echos also generate samples.  And while that might
> seem like a bug, it's not. :)
> 
> When ping sends multiple packets for a single invocation it uses the
> same ICMP ID and just increments the ICMP seq, e.g.:
> 
> 14:07:41.986618 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4
> (0x0800), length 98: (tos 0x0, ttl 64, id 58647, offset 0, flags [DF],
> proto ICMP (1), length 84)
>     42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 1, length 64
> 
> 14:07:42.988077 00:00:00:00:00:02 > 00:00:00:00:00:01, ethertype IPv4
> (0x0800), length 98: (tos 0x0, ttl 64, id 59085, offset 0, flags [DF],
> proto ICMP (1), length 84)
>     42.42.42.3 > 42.42.42.2: ICMP echo request, id 35717, seq 2, length 64
> 
> But conntrack doesn't use the ICMP ID in the key for the session it
> installs:

Sorry about the typo, I meant to say "conntrack doesn't use the ICMP SEQ
in the key for the session it installs, it only uses the ICMP ID".

> 
> # ovs-appctl dpctl/dump-conntrack | grep 42.42.42
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=4,mark=131104,labels=0xf4240000000000000000000000000
> icmp,orig=(src=42.42.42.3,dst=42.42.42.2,id=35628,type=8,code=0),reply=(src=42.42.42.2,dst=42.42.42.3,id=35628,type=0,code=0),zone=6,mark=131072,labels=0x1e8480000000000000000000000000
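
The numbers in this dump can be related back to the Sample metadata
with a little arithmetic.  The sketch below assumes the metadata sits
in the most significant 32 bits of the 128-bit ct label; that layout is
inferred from the values shown in this thread, not taken from OVN
documentation, and the helper names are made up for the example.

```python
# Assumption: sample metadata occupies bits 96..127 of the ct label.
POINT_ID_SHIFT = 96


def label_for(metadata):
    """Build the ct label value that would carry this metadata."""
    return metadata << POINT_ID_SHIFT


def metadata_from(label):
    """Recover the metadata from a ct label value."""
    return label >> POINT_ID_SHIFT


for metadata in (1000000, 2000000):
    label = label_for(metadata)
    # The leading hex digits of the label are the metadata in hex.
    print(metadata, hex(metadata), hex(label))
    assert metadata_from(label) == metadata

# The decimal marks in the dump, written the way the datapath flows
# print them.
print(hex(131104))  # 0x20020
print(hex(131072))  # 0x20000
```

1000000 is 0xf4240 and 2000000 is 0x1e8480, which are the leading
digits of the two labels above, so zone 4 carries the to-lport ACL's
metadata and zone 6 carries the from-lport ACL's metadata.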
> 
> So, subsequent ICMP requests will match on these two existing
> established entries and (because sample_est is configured) samples are
> generated for them too.
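
A rough way to see the effect on sample counts, as a sketch only: the
ct_key(), ping() and count_samples() helpers below are hypothetical,
and the model assumes two CT zones that each emit one sample for every
packet matching an already-committed entry.

```python
from collections import namedtuple

IcmpEcho = namedtuple("IcmpEcho", "src dst icmp_id seq")


def ct_key(pkt):
    """Toy session key: endpoints plus ICMP ID; seq is deliberately left out."""
    return (frozenset((pkt.src, pkt.dst)), pkt.icmp_id)


def ping(count, icmp_id=35717):
    """Requests and replies for one ping invocation (same ID, rising seq)."""
    packets = []
    for seq in range(1, count + 1):
        packets.append(IcmpEcho("42.42.42.3", "42.42.42.2", icmp_id, seq))
        packets.append(IcmpEcho("42.42.42.2", "42.42.42.3", icmp_id, seq))
    return packets


def count_samples(packets, zones=2):
    """Count samples, assuming one sample per zone per established packet."""
    seen = set()
    samples = 0
    for pkt in packets:
        key = ct_key(pkt)
        if key in seen:
            samples += zones      # entry exists in every zone: sample in each
        else:
            seen.add(key)         # first packet only commits the entries
    return samples


print(count_samples(ping(1)))  # 2
print(count_samples(ping(2)))  # 6
```

With one packet only the reply is sampled.  With two, the second
request shares the session key of the first, so it is sampled as well.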
> 
> That's also visible in the datapath flows that forward packets in the
> "original" direction (ICMP ECHOs in our case):
> 
> # ovs-appctl dpctl/dump-flows | grep sample | grep '\-rpl'
> recirc_id(0x29),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20000/0xff0071),ct_label(0x1e8480000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:01),eth_type(0x0800),ipv4(proto=1,frag=no),
> packets:8, bytes:784, used:2.342s,
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=2000000,output_port=4294967295)),ct(commit,zone=6,mark=0x20000/0xff0071,label=0x1e8480000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),ct(zone=4),recirc(0x2a)
> 
> recirc_id(0x2a),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0x20020/0xff0071),ct_label(0xf4240000000000000000000000000),eth(src=00:00:00:00:00:02,dst=00:00:00:00:00:00/ff:ff:00:00:00:00),eth_type(0x0800),ipv4(proto=1,frag=no),
> packets:8, bytes:784, used:2.342s,
> actions:userspace(pid=4294967295,flow_sample(probability=65535,collector_set_id=2,obs_domain_id=33554434,obs_point_id=1000000,output_port=4294967295)),ct(commit,zone=4,mark=0x20020/0xff0071,label=0xf4240000000000000000000000000/0xffffffffffff00000000000000000000,nat(src)),1
> 
> So, for a less complicated test, maybe you should try with UDP/TCP instead.
> 
> I hope that clarifies your doubts.
> 
> Best regards,
> Dumitru
> 
>> Best regards,
>>
>> Oscar
>>
>>
>> On Thu, May 8, 2025 at 8:11 PM Dumitru Ceara <dce...@redhat.com> wrote:
>>
>>     Hi Oscar,
>>
>>     On 5/6/25 12:31 PM, Trọng Đạt Trần wrote:
>>     > As requested, I’ve attached additional tracing information related to
>>     > the sampling duplication issue.
>>     >
>>     >   * The file |ofproto_trace.log| contains the full output of
>>     >     |ofproto/trace| commands.
>>     >
>>     >   * The archive |ovn-detrace.tar.gz| includes six separate files, each
>>     >     corresponding to an |ovn-detrace| output for a flow I believe is
>>     >     involved in the duplicated sampling.
>>     >
>>     > Since I’m not fully confident in how to use the |--ct-next| option,
>>     > I’ve included traces for all six related flows to ensure completeness.
>>     >
>>     > Please let me know if you need further details, or if I should re-run
>>     > any commands with additional options.
>>     >
>>
>>     This seems fairly easy to reproduce locally for investigation; I haven't
>>     tried yet, though.  However, would you mind sharing your OVN NB database
>>     file (I'm assuming this is a test environment)?
>>
>>     I would like to make sure we don't have any misunderstanding because the
>>     terms you use below in your ACL description (e.g., "outbound"/"inbound")
>>     are not standard terms.  Having the actual ACL (and the rest of the NB)
>>     contents will make it easier to debug.
>>
>>     Thanks,
>>     Dumitru
>>
>>     > Best regards,
>>     >
>>     > *Oscar*
>>     >
>>     >
>>     > On Tue, May 6, 2025 at 4:15 PM Adrián Moreno <amore...@redhat.com> wrote:
>>     >
>>     >     On Tue, May 06, 2025 at 11:48:07AM +0700, Trọng Đạt Trần wrote:
>>     >     > Dear Adrián,
>>     >     >
>>     >     > Thank you for your response. I’ve applied your suggestion to use
>>     >     > separate sample entries for each ACL. However, I am still seeing
>>     >     > unexpected behavior in the IPFIX output that I’d like to clarify.
>>     >     > Test Setup (Same as Before)
>>     >     >
>>     >     > vm_a ---- network1 ---- router ---- network2 ---- vm_b
>>     >     >
>>     >     >
>>     >     >    - Two ACLs:
>>     >     >       - ACL A: allow-related *outbound* IPv4
>>     >     >       - ACL B: allow-related *inbound* ICMP
>>     >     >    - ACLs applied symmetrically to both VMs.
>>     >     >    - Test traffic: ICMP request from vm_b to vm_a, and reply from
>>     >     >      vm_a to vm_b.
>>     >     >
>>     >     > Key Problem Observed
>>     >     >
>>     >     > When sampling is enabled on *both* ACLs, the IPFIX record for
>>     >     > *flow (3)* (the ICMP reply from vm_a → router) shows
>>     >     > *120 packets/min*.
>>     >     >
>>     >     > However:
>>     >     >
>>     >     >    - If *only ACL B* (inbound ICMP) is sampled → (3) = 60 packets/min
>>     >     >    - If *only ACL A* (outbound IPv4) is sampled → (3) not present
>>     >     >    - If both are sampled → (3) = 120 packets/min
>>     >     >
>>     >     > This suggests that *flow (3) is being sampled twice*, even though
>>     >     > it represents a *single logical flow and matches only ACL B*.
>>     >     >
>>     >     > IPFIX Observations
>>     >     > Flow  Description                    Expected   Actual
>>     >     > (1)   vm_b → router (ICMP request)   60 pkt/m   60
>>     >     > (2)   router → vm_a (ICMP request)   60 pkt/m   60
>>     >     > (3)   vm_a → router (ICMP reply)     60 pkt/m   120 ⚠️
>>     >     > (4)   router → vm_b (ICMP reply)     60 pkt/m   60
>>     >
>>     >     This is not what I'd expect, maybe Dumitru knows?
>>     >
>>     >     Could you attach ofproto/trace and ovn-detrace outputs from both
>>     >     directions?
>>     >
>>     >     Thanks.
>>     >     Adrián
>>     >
>>

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
