On 6/6/24 22:41, Ilya Maximets wrote:
> On 6/6/24 20:59, Sri kor via discuss wrote:
>> Hi Team,
>>
>> Currently we are facing "ERR|group-table: out of table ids". We are
>> running OVN 23.09 and OVS 3.2.2. From the retis trace, the packet
>> appears to be dropped shortly after the upcall is generated. The drop
>> reason is reported as NOT_SPECIFIED, but the trace indicates that the packet
>> is not forwarded further within the OVS kernel datapath at this point.
>>
>> As per https://issues.redhat.com/browse/FDP-70 , this issue was fixed in OVN
>> 23.09.
>>
>>
>> Jun 05 21:21:05 vaeq-cu1a-r207-prod-hv-03.vaeq-cu.infra.cx systemd[1]:
>> Started OVN controller daemon.
>> Jun 05 21:21:05 vaeq-cu1a-r207-prod-hv-03.vaeq-cu.infra.cx
>> ovn-controller[1253156]: ovs|00023|extend_table|ERR|*table group-table: out
>> of table ids.*
>>
>> #retis sort /tmp/playground-test1.json
>>
>>
>> 1306445529784444 (102) [swapper/102] 0 [tp] openvswitch:ovs_dp_upcall
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 0
>> if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl 235
>> tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win
>> 65535
>> upcall (miss) port 2774067634 cpu 102
>> * + 1306445529798391 (102) [swapper/102] 0 [tp] skb:kfree_skb
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 1 drop (reason
>> NOT_SPECIFIED)*
>> * * if 4 (enp148s0f0np0) rxif 4
>> + 1306445529802935 (102) [swapper/102] 0 [kr] queue_userspace_packet
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 2
>> if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl
>> 235 tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win
>> 65535
>> upcall_enqueue (miss) (102/1306445529784444) q 1636019689 ret 0
>> + 1306445529807829 (102) [swapper/102] 0 [kr] ovs_dp_upcall
>> #4a4348db8a07cff2a94af7da68c00 (skb ff2a93b2b82afd00) n 3
>> if 4 (enp148s0f0np0) rxif 4 91.107.186.166.55805 > 204.52.24.59.22 ttl
>> 235 tos 0x0 id 54321 off 0 len 40 proto TCP (6) flags [S] seq 3425966139 win
>> 65535
>> upcall_ret (102/1306445529784444) ret 0
>>
>>
>> [root@cloud-user]# ovs-vsctl --version
>> ovs-vsctl (Open vSwitch) *3.2.2*
>> DB Schema 8.4.0
>>
>> [root@vcloud-user]# ovn-controller --version
>> ovn-controller *23.09.1*
>
> Are you building this package yourself? If so, on which exact commit is it
> based?
> If not, what distribution are you using and what is the exact rpm/deb package
> version?
>
> My suspicion is that it is not exactly v23.09.1, but code from a few commits
> earlier than that. In this case, it may not include the fix.
>
I agree, the versions listed above look a bit off. If I run OVN
v23.09.1 in a sandbox I get:

  $ ovn-controller --version
  ovn-controller 23.09.1
  Open vSwitch Library 3.3.90    <<< this differs from 3.2.2 listed above
  OpenFlow versions 0x6:0x6
  SB DB Schema 20.29.

Checking when we bumped the OVS submodule from 3.2.2 to the tip (at that
moment) of 3.3, it was:

  1fa7628db415 ("ovs: Bump submodule to include E721 fixes.")

The log between that version and the actual v23.09.1 release is:

  $ git log --oneline 1fa7628db415..v23.09.1
  0afd4e59e9 (HEAD, tag: v23.09.1) Set release date for 23.09.1.
  <snip>
  e9e716ad53 controller: Don't artificially limit group and meter IDs to 16bit.
  <snip>
  627955eb79 ci: Pin Python, Fedora and Ubuntu runner versions.

What we need is actually:

  commit e9e716ad531e34766d2f02783ac08955096bf636
  Author: Dumitru Ceara <[email protected]>
  Date:   Tue Oct 31 18:00:44 2023 +0100

      controller: Don't artificially limit group and meter IDs to 16bit.

There were a few follow-up fixes for it though:

  acc63727d14f ("controller: fix group_table and meter_table allocation")
  c0c9e5074704 ("features.c: Always wait on the rconn.")
  40b670e6ee94 ("ovn-controller: Fix busy loop when ofctrl is disconnected.")

So I guess the recommendation would be to use the most recent v23.09 release,
that is: v23.09.4
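
To double check whether the source a given build is based on already contains
the fix, something along these lines could be used. This is only a rough
sketch: the helper name check_fixes and the checkout path in the usage comment
are made up for illustration, and it assumes a local clone of the OVN
repository with tags fetched. It relies on "git merge-base --is-ancestor",
which exits with 0 when the first commit is reachable from the second.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVN): report whether each given commit
# is reachable from a ref (tag, branch or commit) in a local git clone.
#
# Usage: check_fixes <repo-dir> <ref> <commit>...
# Returns 0 if all commits are reachable from <ref>, 1 otherwise.
check_fixes() {
    repo=$1
    ref=$2
    shift 2
    missing=0
    for c in "$@"; do
        # A non-zero exit means "not an ancestor" or "unknown commit/ref";
        # both are reported as MISSING here.
        if git -C "$repo" merge-base --is-ancestor "$c" "$ref" 2>/dev/null; then
            echo "present: $c"
        else
            echo "MISSING: $c"
            missing=1
        fi
    done
    return "$missing"
}

# Example (the path is an assumption, adjust to your checkout):
#   check_fixes ~/src/ovn v23.09.1 e9e716ad53
```

Backports to a release branch are often cherry-picks with their own hashes,
so if one of the follow-up commits shows up as MISSING it may be worth
searching by subject instead, for example:
git log --oneline -F --grep='controller: fix group_table and meter_table allocation' <ref>
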
Regards,
Dumitru
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss