I have tried both OVS v2.17.8 and OVS v3.2.1, and was able to reproduce this
issue on both versions.

[root@localhost kaiyuan]# ovs-vsctl -V
ovs-vsctl (Open vSwitch) 3.2.1
DB Schema 8.4.0

[root@localhost kaiyuan]# ovs-appctl bond/show
---- eobond ----
bond_mode: balance-tcp
bond may use recirculation: yes, Recirc-ID : 1
bond-hash-basis: 0
lb_output action: disabled, bond-id: -1
updelay: 0 ms
downdelay: 0 ms
next rebalance: 6629 ms
lacp_status: negotiated
lacp_fallback_ab: false
active-backup primary: <none>
active member mac: 98:a9:2d:c5:00:69(u0)

member u0: enabled
  active member
  may_enable: true
  hash 223: 9007199254740988 kB load

member u1: enabled
  may_enable: true
  hash 175: 9007199254739468 kB load

Best regards, Weiwei Zhang

-----Original email-----
From: Ilya Maximets [mailto:i.maxim...@ovn.org]
Send time: 2/1/2024 10:27
To: zhangweiwei (RD) <zhang.wei...@h3c.com>; b...@openvswitch.org
Cc: Eelco Chaudron <echau...@redhat.com>; i.maxim...@ovn.org
Subject: Re: [ovs-discuss] bond: bond entry statistics overflow

> On 1/31/24 10:29, Zhangweiwei via discuss wrote:
> Hi,
>
> I encountered an issue while using OVS bond in balance-tcp mode.
> After performing down and up operations on bond members, the bond
> entry statistics displayed by ovs-appctl bond/show overflowed.
> In addition to the incorrect statistics values, this also led to longer
> load balancing time for bond members.
>
> 1. Information:
> OVS version: 2.17.2
> bond mode: balance-tcp
> openflow: cookie=0x0, duration=7027.270s, table=0, n_packets=15169077,
> n_bytes=9334457220, priority=0 actions=NORMAL
> datapath_type: netdev
>
> 2. ovs-appctl bond/show output:
>
> [root@localhost zzz]# ovs-appctl bond/show
> ---- eobond ----
> bond_mode: balance-tcp
> bond may use recirculation: yes, Recirc-ID : 1
> bond-hash-basis: 0
> lb_output action: disabled, bond-id: -1
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 9673 ms
> lacp_status: negotiated
> lacp_fallback_ab: false
> active-backup primary: <none>
> active member mac: 98:a9:2d:c5:00:69(u0)
>
> member u0: enabled
>   active member
>   may_enable: true
>   hash 89: 9007199254740413 kB load
>   hash 219: 9007199254740991 kB load
>
> member u1: enabled
>   may_enable: true
>   hash 141: 9007199254520657 kB load
>
> 3. Analysis:
>
> After performing down and up operations on bond members, the recirc rules
> are changed. The bond_entry_account() function updates bond entry
> statistics from the recirc rule counters. When rule_tx_bytes <
> entry->pr_tx_bytes, the unsigned subtraction that computes delta wraps
> around (overflows).
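
For illustration, the wraparound described above can be reproduced outside OVS
with a small C sketch. The function names counter_delta(),
load_kb_after_one_decay() and wraparound_example() are hypothetical and do not
exist in OVS. The sketch also assumes that the accumulated byte count is halved
once before being printed in kB; that assumption is not stated in this thread,
but under it a wrapped delta yields 9007199254740991 kB, the value shown for
hash 219 above.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as the delta computation in bond_entry_account():
 * unsigned subtraction wraps modulo 2^64 when cur < prev. */
uint64_t
counter_delta(uint64_t cur, uint64_t prev)
{
    return cur - prev;
}

/* Hypothetical helper: the load in kB, ASSUMING the byte count is halved
 * once (one decay step) before it is displayed. */
uint64_t
load_kb_after_one_decay(uint64_t tx_bytes)
{
    return (tx_bytes / 2) / 1024;
}

/* Usage example: the rule counter drops from 200 bytes to 100 bytes. */
void
wraparound_example(void)
{
    uint64_t delta = counter_delta(100, 200);

    assert(delta == UINT64_MAX - 99);                    /* 2^64 - 100 */
    assert(load_kb_after_one_decay(delta)
           == UINT64_C(9007199254740991));               /* 2^53 - 1 */
}
```

A counter that moves forward (cur >= prev) gives the expected small delta, so
only a counter that jumps back produces the huge values.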

So, the main issue here seems to be that the statistics on the rule itself
jump back for some reason.  There were a few patches in the past year or so
that fix several occurrences of similar statistics issues during flow dumps.
Can you reproduce the issue on the latest v2.17.8 release?  2.17.2 is fairly
old and doesn't contain many important bug fixes.  Newer releases like 3.1.2+
also have enhanced logging around statistics mishaps, so they are easier to
debug.

Best regards, Ilya Maximets.

>
> static void
> bond_entry_account(struct bond_entry *entry, uint64_t rule_tx_bytes)
>     OVS_REQ_WRLOCK(rwlock)
> {
>     if (entry->member) {
>         uint64_t delta;
>
>         /* Wraps around when rule_tx_bytes < entry->pr_tx_bytes. */
>         delta = rule_tx_bytes - entry->pr_tx_bytes;
>         entry->tx_bytes += delta;
>         entry->pr_tx_bytes = rule_tx_bytes;
>     }
> }
>
> 4. Solution:
>
> I tried adding last_pr_rule to struct bond_entry to solve this problem.
> When the recirc rule changes, delta = rule_tx_bytes, so entry->tx_bytes +=
> rule_tx_bytes.
> But I'm not sure whether the value of entry->tx_bytes is correct after the
> modification.
>
> index ddc96a4..7b14d53 100644
> --- a/openvswitch-2.17.2/ofproto/bond.c
> +++ b/openvswitch-test/ofproto/bond.c
> @@ -71,6 +71,7 @@ struct bond_entry {
>       * 'pr_tx_bytes' is the most recently seen statistics for 'pr_rule', which
>       * is used to determine delta (applied to 'tx_bytes' above.) */
>      struct rule *pr_rule;
> +    struct rule *last_pr_rule;
>      uint64_t pr_tx_bytes OVS_GUARDED_BY(rwlock);
>  };
>
> @@ -990,8 +991,12 @@ bond_entry_account(struct bond_entry *entry, uint64_t rule_tx_bytes)
>      if (entry->member) {
>          uint64_t delta;
>
> -        delta = rule_tx_bytes - entry->pr_tx_bytes;
> -        entry->tx_bytes += delta;
> +        if (entry->last_pr_rule != entry->pr_rule) {
> +            entry->tx_bytes += rule_tx_bytes;
> +        } else {
> +            delta = rule_tx_bytes - entry->pr_tx_bytes;
> +            entry->tx_bytes += delta;
> +        }
>          entry->pr_tx_bytes = rule_tx_bytes;
>      }
> }
> @@ -1027,6 +1032,7 @@ bond_recirculation_account(struct bond *bond)
>              continue;
>          }
>          bond_entry_account(entry, stats.n_bytes);
> +        entry->last_pr_rule = rule;
>      }
> }
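
The accounting logic of the proposed patch can be tried in isolation with a
minimal sketch. struct entry_sketch, entry_account_sketch() and the integer
rule IDs are hypothetical stand-ins for struct bond_entry,
bond_entry_account() and the rule pointers. The update of last_pr_rule, which
the patch performs in bond_recirculation_account(), is folded into the same
function here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct bond_entry. */
struct entry_sketch {
    uint64_t tx_bytes;      /* Accumulated bytes for this hash entry. */
    uint64_t pr_tx_bytes;   /* Last counter value seen on the rule. */
    int last_rule_id;       /* Rule seen on the previous call, 0 = none. */
};

/* Mirrors the patched logic: when the rule differs from the one seen last
 * time, its counter started from zero, so the whole value is new traffic.
 * Otherwise only the difference since the previous sample is added. */
void
entry_account_sketch(struct entry_sketch *entry, int rule_id,
                     uint64_t rule_tx_bytes)
{
    assert(entry != NULL);

    if (entry->last_rule_id != rule_id) {
        entry->tx_bytes += rule_tx_bytes;
    } else {
        entry->tx_bytes += rule_tx_bytes - entry->pr_tx_bytes;
    }
    entry->pr_tx_bytes = rule_tx_bytes;
    entry->last_rule_id = rule_id;
}
```

With counter values 1000 and 1500 on rule 1 followed by 200 on a new rule 2,
the sketch accumulates 1000 + 500 + 200 = 1700 bytes. It does not cover the
case where the counter of the same rule moves backwards, which still wraps.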
