On 28/10/2021 16:41, Numan Siddique wrote:
On Thu, Oct 28, 2021 at 5:20 AM Brendan Doyle <[email protected]> wrote:
Numan,

Just wondering if you got a chance to look at those logs?
I looked into the logs, and as I had mentioned earlier, you need this fix:
https://urldefense.com/v3/__https://github.com/ovn-org/ovn/commit/e7788554a7f5e824fc0d8afc6cbf20e94fe4245f__;!!ACWV5N9M2RV99hQ!amdtq3tQhwFCtbvjxSuF5ItzNk_07I0bBJvt5mu3lbJc-NBU5rsCp9IIullXTrxBXf8$

Please let me know if you still see this issue with the latest OVN or
with the version of OVN which has this fix.
This fix is available from OVN 21.03 onwards.

So we have verified with OVN 21.09.0 and OVS 2.16.90 that this does indeed fix the
"runaway" conf.db issue.

But I have a question: I still see conf.db growing over time, though in much smaller increments. Does this DB ever shrink? At present my NB is empty and my SB is very small (see below), yet the conf.db on my chassis steadily grew from a few KB to over a MB while I created and deleted various switches, routers and gateways. For example:

ls -lh /etc/openvswitch/conf.db
-rw-r-----. 1 root root 1.6M Nov  9 10:41 /etc/openvswitch/conf.db

Will this file just keep growing and eventually reach GBs, or does it
ever shrink when switches, gateways and routers are deleted?


# ovn-nbctl show
#


# ovn-sbctl show
Chassis "b4ba6c5b-0c85-4db7-adeb-a722ced41fac"
    hostname: pcamn01
    Encap geneve
        ip: "253.255.0.33"
        options: {csum="true"}
Chassis pcacn005
    hostname: pcacn005
    Encap geneve
        ip: "253.255.2.68"
        options: {csum="true"}
Chassis pcacn002
    hostname: pcacn002.ovca2.us.oracle.com
    Encap geneve
        ip: "253.255.2.65"
        options: {csum="true"}
Chassis "48aa3fd0-1d0f-4c6a-a444-64e96a80ce72"
    hostname: pcamn03
    Encap geneve
        ip: "253.255.0.35"
        options: {csum="true"}
Chassis pcacn001
    hostname: pcacn001.ovca2.us.oracle.com
    Encap geneve
        ip: "253.255.2.64"
        options: {csum="true"}
Chassis pcacn003
    hostname: pcacn003.ovca2.us.oracle.com
    Encap geneve
        ip: "253.255.2.66"
        options: {csum="true"}
Chassis "07eecb27-cd68-46d6-83ff-cb1bb84b85f3"
    hostname: pcamn02
    Encap geneve
        ip: "253.255.0.34"
        options: {csum="true"}


Thanks
Numan

Thanks

Brendan

On 27/10/2021 11:25, Brendan Doyle wrote:

Hi,

I finally got some debug logs. They are truncated after the failure occurs;
the truncated entries are just repeated updates of the same entry.

To shed some more light on this: it seems to be a timing issue. The test being
run involves creating a number of Logical Switches (LS), Routers (LR) and
Distributed Router Port gateways (DR), and then immediately deleting them, with
the last created DR being deleted first. Our CMS uses the ovsdbapp Python
library to do this.

So it occurs to me that perhaps the objects get created in NB, but before they
have been propagated to SB and to the HV chassis we get the delete, and this
causes updates to be sent to the chassis for a logical port that does not
exist? Just a hypothesis.

ovn-nbctl has synchronization flags (--wait) to guard against such behavior.
Does ovsdbapp, I wonder?

In any case, the test fails (we see a runaway conf.db) pretty regularly, but
not every time. The failure is always observed on the delete operations. If I
put a delay after create and before delete, then we don't see the failure.
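
As an alternative to a fixed delay, a test could block on "ovn-nbctl
--wait=hv sync" between the create and delete phases. The following is only a
minimal Python sketch of that idea: the helper names build_sync_command and
wait_for_hypervisors are hypothetical, it assumes ovn-nbctl is on the PATH of
the host running the test, and it does not answer whether ovsdbapp has an
equivalent of its own.

```python
import subprocess

# Wait modes accepted by ovn-nbctl's --wait option for the sync command.
VALID_WAIT_MODES = ("sb", "hv")


def build_sync_command(wait="hv"):
    """Return the argv for an ovn-nbctl sync that blocks until the chosen
    layer (southbound DB or hypervisors) has caught up with the NB DB."""
    if wait not in VALID_WAIT_MODES:
        raise ValueError(f"wait must be one of {VALID_WAIT_MODES}, got {wait!r}")
    return ["ovn-nbctl", f"--wait={wait}", "sync"]


def wait_for_hypervisors(wait="hv", timeout_s=60):
    """Run the sync command. Raises CalledProcessError if ovn-nbctl fails and
    TimeoutExpired if it takes longer than timeout_s seconds (hypothetical
    default, chosen only for illustration)."""
    subprocess.run(build_sync_command(wait), check=True, timeout=timeout_s)
```

A test would call wait_for_hypervisors() after the last create and before the
first delete, in place of the sleep.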

If anyone can shed light on this from the logs, it would be much appreciated.

Thanks

Brendan







On 26/10/2021 17:11, Brendan Doyle wrote:



On 26/10/2021 15:50, Numan Siddique wrote:

On Tue, Oct 26, 2021 at 8:20 AM Brendan Doyle <[email protected]> wrote:

Hi,


So what is very odd here is that I have used ovn-nbctl to delete the NB
config, so:
# ovn-nbctl show
# ovn-sbctl lflow-list

Yet I still see /etc/openvswitch/conf.db growing with updates for
Logical switch ports that no longer exist!

"],["ct-zone-ln-ls_vcn9195577_external_ugw","220"],["ct-zone-ln-ls_vcn9206002_external_igw","110"],["ct-zone-ln-ls_vcn9210052_external_igw","110"],["ct-zone-ln-ls_vcn9232395_external_ugw","75"],["ct-zone-ln-ls_vcn9236987_external_igw","110"],["ct-zone-ln-ls_vcn9236987_external_ugw","78"],["ct-zone-ln-ls_vcn9255861_external_igw","118"],["ct-zone-ln-ls_vcn9255861_external_ugw","100"],["ct-zone-ln-ls_vcn9319435_external_igw","87"],["ct-zone-ln-ls_vcn9352502_external_igw","40"],["ct-zone-ln-ls_vcn9402504_external_ugw","99"],["ct-zone-ln-ls_vcn9403404_external_igw","133"],["ct-zone-ln-ls_vcn9403404_external_ugw","114"],["ct-zone-ln-ls_vcn9461566_external_ugw","191"],["ct-zone-ln-ls_vcn9480000_external_igw","254"],["ct-zone-ln-ls_vcn9480000_external_ugw","236"],["ct-zone-ln-ls_vcn9492134_external_igw","262"],["ct-zone-ln-ls_vcn9523503_external_igw","207"],["ct-zone-ln-ls_vcn9542102_external_igw","133"],["ct-zone-ln-ls_vcn9542102_external_ugw","115"],["ct-zone-ln-ls_vcn9559658_external_igw","125"],["ct-zone-ln-ls_vcn9559658_external_ugw","78"],["ct-zone-ln-ls_vcn9594034_external_igw","49"],["ct-zone-ln-ls_vcn9619021_external_igw","133"],["ct-zone-ln-ls_vcn9634773_external_igw","292"],["ct-zone-ln-ls_vcn9649169_external_igw","132"],["ct-zone-ln-ls_vcn9649169_external_ugw","110"],["ct-zone-ln-ls_vcn9661290_external_ugw","78"],["ct-zone-ln-ls_vcn9734192_external_ugw","114"],["ct-zone-ln-ls_vcn9774252_external_igw","262"],["ct-zone-ln-ls_vcn9796262_external_igw","72"],["ct-zone-ln-ls_vcn9796262_external_ugw","54"],["ct-zone-ln-ls_vcn9805903_external_igw","147"],["ct-zone-ln-ls_vcn9805903_external_ugw","126"],["ct-zone-ln-ls_vcn9809895_external_igw","246"],["ct-zone-ln-ls_vcn9812576_external_ugw","78"],["ct-zone-ln-ls_vcn9834728_external_igw","110"],["ct-zone-ln-ls_vcn9886683_external_ugw","114"],["ct-zone-ln-ls_vcn9903419_external_ugw","235"],["ct-zone-ln-ls_vcn9917510_external_igw","56"],["ct-zone-ln-ls_vcn9917510_external_ugw","38"]]]}},"_comment":"ovn-controller:
modifying OVS tunnels 'pcacn001'"}

That is a shortened version of one entry. Could it be that switch ports must
be deleted before deleting the switch? I was under the impression that once a
switch is deleted its ports get deleted too?

Yes. If you delete the switch, the switch ports get deleted too.

After deleting the logical switch (or switch ports), do you see them
deleted by ovn-northd in the SB DB?

Run - ovn-sbctl list port_binding <deleted_port>
or/and

ovn-sbctl list datapath_binding <deleted_lswitch>

I'd suggest you enable jsonrpc debug in ovn-controller and see what's happening.
It would be helpful if you can share the ovn-controller debug logs.

ovn-appctl -t ovn-controller vlog/set jsonrpc:dbg



So in my test I create a simple network and then delete it, so NB DB and SB DB
are empty.

# ovn-sbctl list port_binding
# ovn-sbctl list datapath_binding
#

The network has a number of LSs and LRs and two Distributed Router (DR) ports
(on separate LRs). When I create just one DR all seems fine, but when I add the
second into the mix I get a runaway openvswitch/conf.db, but NOT on all
chassis. I have 4 chassis that I can schedule the DR ports to. In this latest
test I observed the runaway conf.db on pcacn003 & pcacn005. The logs are too
large to send in email; is there an FTP server that I can upload to?

I will redo with debug enabled and collect updated logs. The conf.db on both
pcacn003 & pcacn005 is several GB.


The only way to recover is to stop the OVS/OVN processes, delete
/etc/openvswitch/conf.db and restart them.

Brendan




Thanks
Numan


switch 712757c3-2481-4f8b-940c-05dc13ce37a5 (ls_vcn9319435_external_ugw)
       port ls_vcn9319435_external_ugw-lr_vcn9319435
           type: router
           router-port: lr_vcn9319435-ls_vcn9319435_external_ugw
       port ln-ls_vcn9319435_external_ugw
           type: localnet
           addresses: ["unknown"]

router 80c281af-319b-416b-8a17-0ce7b8901bb1 (lr_vcn9319435)
       port lr_vcn9319435-ls_vcn9319435_external_ugw
           mac: "00:13:97:88:31:90"
           networks: ["253.255.80.4/16"]
           gateway chassis: [pcacn002 pcacn003 pcacn001]
       port lr_vcn9319435-lsb_vcn9319435
           mac: "00:13:97:d4:26:ec"
           networks: ["253.255.29.2/25"]
       nat 6c87050f-cd27-423e-815e-deda74bd9bc6
           external ip: "253.255.80.4"
           logical ip: "10.221.0.0/16"
           type: "snat"

Does each port have to be deleted, or is it OK to just delete the switch
and router?

Brendan

On 25/10/2021 16:10, Brendan Doyle wrote:


On 25/10/2021 15:08, Numan Siddique wrote:

On Fri, Oct 22, 2021 at 9:30 AM Brendan Doyle
<[email protected]> wrote:

Hi,


Looking at /etc/openvswitch/conf.db I see it getting very large:

[root@pcacn001 ~]# ls -l /etc/openvswitch/conf.db
-rw-r--r--. 1 root root 6069248828 Oct 22 11:55 /etc/openvswitch/conf.db

It contains a great many entries, mostly "ovn-controller: modifying OVS
tunnels" updates like the one below.
What are these? This does not seem normal.
OVSDB JSON 4687 00e8788dd5d9af2aac5ca7724759017c52ddd580
{"_date":1634903752117,"Bridge":{"745726c4-0451-4f52-a52b-1f9c5e85c703":{"external_ids":["map",[["ct-zone-0dca7370-1c18-4117-84e4-a72f277ccc6c_dnat","4"],["ct-zone-0dca7370-1c18-4117-84e4-a72f277ccc6c_snat","1"],["ct-zone-11637f38-8725-4c77-adfe-f9c4c804ae8c_dnat","4"],["ct-zone-11637f38-8725-4c77-adfe-f9c4c804ae8c_snat","5"],["ct-zone-1de487d1-f3a5-4b15-bae4-aa8cf794fcf9_dnat","17"],["ct-zone-1de487d1-f3a5-4b15-bae4-aa8cf794fcf9_snat","7"],["ct-zone-22c71c2a-0e59-41cc-a2da-91d3c7276c11_dnat","9"],["ct-zone-22c71c2a-0e59-41cc-a2da-91d3c7276c11_snat","10"],["ct-zone-3228b120-4192-476b-ab67-51fb45e786d6_dnat","3"],["ct-zone-3228b120-4192-476b-ab67-51fb45e786d6_snat","4"],["ct-zone-3753ff1a-d0cf-48e4-b06a-640f0467d202_dnat","19"],["ct-zone-3753ff1a-d0cf-48e4-b06a-640f0467d202_snat","18"],["ct-zone-3c1c02f4-31c9-45d4-9c63-54ad2122bb15_dnat","10"],["ct-zone-3c1c02f4-31c9-45d4-9c63-54ad2122bb15_snat","16"],["ct-zone-423896cb-5573-4c54-b6e2-38f192eacae3_dnat","9"],["ct-zone-423896cb-5573-4c54-b6e2-38f192eacae3_snat","12"],["ct-zone-46b7b247-31a7-4fbb-88b9-0f3db042409c_dnat","10"],["ct-zone-46b7b247-31a7-4fbb-88b9-0f3db042409c_snat","11"],["ct-zone-51376927-fca0-49b3-b0ba-1aa22153b366_dnat","2"],["ct-zone-51376927-fca0-49b3-b0ba-1aa22153b366_snat","5"],["ct-zone-58033baa-916d-47d4-bcf0-d95f7fb1f861_dnat","18"],["ct-zone-58033baa-916d-47d4-bcf0-d95f7fb1f861_snat","3"],["ct-zone-5f92f974-f0dc-4820-bb43-a14cc16d851f_dnat","12"],["ct-zone-5f92f974-f0dc-4820-bb43-a14cc16d851f_snat","11"],["ct-zone-87055326-0535-4042-a0ff-bf0e9f494433_dnat","10"],["ct-zone-87055326-0535-4042-a0ff-bf0e9f494433_snat","12"],["ct-zone-8a840bfe-118f-4041-ac72-0637d6373ffc_dnat","1"],["ct-zone-8a840bfe-118f-4041-ac72-0637d6373ffc_snat","11"],["ct-zone-8fff9b0b-0fd6-42f9-ab77-e9f1475a5d82_dnat","2"],["ct-zone-8fff9b0b-0fd6-42f9-ab77-e9f1475a5d82_snat","13"],["ct-zone-913c36a1-f987-4084-9119-f279b317c72f_dnat","11"],["ct-zone-913c36a1-f987-4084-9119-f279b317c72f_snat","12"],["ct-zone-9498aca9-7623-4ce0-a0ff-d4d5c17d7223_dnat","19"],["ct-zone-9498aca9-7623-4ce0-a0ff-d4d5c17d7223_snat","15"],["ct-zone-9c373522-fd02-424f-a2b3-14dc359062d2_dnat","18"],["ct-zone-9c373522-fd02-424f-a2b3-14dc359062d2_snat","17"],["ct-zone-a28b45db-2dfb-4d38-905c-c5eb44da8c9c_dnat","13"],["ct-zone-a28b45db-2dfb-4d38-905c-c5eb44da8c9c_snat","10"],["ct-zone-b1e8636a-5cf8-48ba-9693-793a59e5430d_dnat","8"],["ct-zone-b1e8636a-5cf8-48ba-9693-793a59e5430d_snat","14"],["ct-zone-bbcc6e17-ee1e-4e82-b404-1dd0f1307002_dnat","12"],["ct-zone-bbcc6e17-ee1e-4e82-b404-1dd0f1307002_snat","11"],["ct-zone-bd3b86b7-2aba-4ff7-a5f7-975612692aca_dnat","13"],["ct-zone-bd3b86b7-2aba-4ff7-a5f7-975612692aca_snat","10"],["ct-zone-cb94affd-f2aa-4bdd-9407-1e16ac046596_dnat","9"],["ct-zone-cb94affd-f2aa-4bdd-9407-1e16ac046596_snat","1"],["ct-zone-ce71f6db-4dab-41ca-bd10-cd6204687b9d_dnat","16"],["ct-zone-ce71f6db-4dab-41ca-bd10-cd6204687b9d_snat","15"],["ct-zone-cfa46699-cc79-445e-a902-f1e37ff99806_dnat","5"],["ct-zone-cfa46699-cc79-445e-a902-f1e37ff99806_snat","2"],["ct-zone-cr-lr_vcn0747157-ls_vcn0747157_external_ugw","9"],["ct-zone-cr-lr_vcn1645571_igw-ls_vcn1645571_external_igw","21"],["ct-zone-cr-lr_vcn7319607-ls_vcn7319607_external_ugw","14"],["ct-zone-cr-lr_vcn7319607_igw-ls_vcn7319607_external_igw","21"],["ct-zone-cr-lr_vcn7395327_igw-ls_vcn7395327_external_igw","21"],["ct-zone-cr-lr_vcn9567153-ls_vcn9567153_external_ugw","1"],["ct-zone-d0232f68-8d26-454c-87bf-e79066a1ed62_dnat","9"],["ct-zone-d0232f68-8d26-454c-87bf-e79066a1ed62_snat","8"],["ct-zone-d161aaef-e73e-452c-9d77-f465718f1f67_dnat","3"],["ct-zone-d161aaef-e73e-452c-9d77-f465718f1f67_snat","6"],["ct-zone-e2f0a229-15b0-4255-b52d-71b078239ed2_dnat","12"],["ct-zone-e2f0a229-15b0-4255-b52d-71b078239ed2_snat","13"],["ct-zone-e6986bf4-e813-4df0-9bfe-1de95ceb2e30_dnat","15"],["ct-zone-e6986bf4-e813-4df0-9bfe-1de95ceb2e30_snat","14"],["ct-zone-e93b7a93-8507-4036-8281-f2be764a44da_dnat","16"],["ct-zone-e93b7a93-8507-4036-8281-f2be764a44da_snat","17"],["ct-zone-f3b9843a-d498-41dc-8244-0f87d9bc1384_dnat","6"],["ct-zone-f3b9843a-d498-41dc-8244-0f87d9bc1384_snat","7"],["ct-zone-f42fcb51-0af6-426f-974b-1478a169a70c_dnat","13"],["ct-zone-f42fcb51-0af6-426f-974b-1478a169a70c_snat","11"],["ct-zone-f708c12e-34b6-4657-b7d0-4b5ac5e0d6c7_dnat","20"],["ct-zone-f708c12e-34b6-4657-b7d0-4b5ac5e0d6c7_snat","19"],["ct-zone-ln-ls_vcn6603036_external_ugw","7"],["ct-zone-ln-ls_vcn7319607_external_igw","20"],["ct-zone-ln-ls_vcn7395327_external_ugw","7"],["ct-zone-ln-ls_vcn7836024_external_igw","20"],["ct-zone-ln-ls_vcn9567153_external_igw","21"],["ct-zone-ln-ls_vcn9567153_external_ugw","8"]]]}},"_comment":"ovn-controller: modifying OVS tunnels 'pcacn001'"}
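
To see which kind of transaction dominates a large conf.db without reading it
by eye, the records can be tallied by their "_comment" field. The sketch below
is a rough illustration in Python, not an official tool. It assumes the
on-disk layout visible in the excerpt above: a header line of the form
"OVSDB JSON <length> <hash>" followed by <length> bytes of JSON. The function
names iter_records and count_by_comment are made up for this example, and the
hash is not verified.

```python
import io
import json
from collections import Counter

MAGIC = b"OVSDB JSON "


def iter_records(stream):
    """Yield each decoded JSON record from a binary stream that uses the
    header-plus-payload layout shown in the excerpt (assumed format)."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank separator lines
        if not header.startswith(MAGIC):
            raise ValueError(f"unexpected header: {header[:40]!r}")
        # First token after the magic is the payload length in bytes.
        length = int(header[len(MAGIC):].split()[0])
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated final record, e.g. file still being written
        yield json.loads(payload)


def count_by_comment(stream):
    """Return a Counter mapping each _comment string to its record count."""
    counts = Counter()
    for record in iter_records(stream):
        if isinstance(record, dict):
            counts[record.get("_comment", "(no comment)")] += 1
        else:
            counts["(non-object)"] += 1
    return counts
```

Passing a file opened with open("/etc/openvswitch/conf.db", "rb") to
count_by_comment and printing its most_common(5) would show whether the
"modifying OVS tunnels" transactions account for most of the file. The stream
is read record by record, so a multi-GB file is not loaded into memory at once.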

In which OVN version are you seeing this?

ovs-vsctl -V
ovs-vsctl (Open vSwitch) 2.14.0_r0.0.0
DB Schema 8.2.0
# ovn-nbctl -V
ovn-nbctl 20.09.0_r1.0.0
Open vSwitch Library 2.14.0
DB Schema 5.27.0



I wonder if you're seeing this issue -
https://urldefense.com/v3/__https://github.com/ovn-org/ovn/commit/e7788554a7f5e824fc0d8afc6cbf20e94fe4245f__;!!ACWV5N9M2RV99hQ!bwIWH-KoNwkjzx2Sw8BLj6uGXg6zeGUoB-ZG4wtzO42NUmxA95Id3NxKLRgReUsdtEU$

I have to step out for a bit; I will look at this when I can.
What I can say is that we are using ovsdbapp to configure central, and
I see /etc/openvswitch/conf.db getting up to several GB! So much so that
systemd times out when you try to start the service using it.
I am also seeing ovs-vswitchd getting a SEGV on a regular basis, which
I think is related.
I am wondering if this patch might help:

[External] : Re: [ovs-dev] [PATCH branch-2.14] python: idl: Avoid sending
transactions when the DB is not synced up.

I'm not sure. /etc/openvswitch/conf.db is the local ovsdb-server database
and not the OVN database.

Numan

If you run a tail on /etc/openvswitch/conf.db, do you see the ct zone
ids toggling between 2 values constantly?
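
One way to check for that toggling mechanically, rather than by watching tail
output, is sketched below in Python. It is only an illustration under stated
assumptions: the records are taken to be already decoded into dicts, each one
is assumed to carry the full Bridge external_ids map (as in the excerpts in
this thread, not a diff), and the names ct_zone_maps, toggling_keys and the
min_changes threshold are invented for this example.

```python
from collections import defaultdict


def ct_zone_maps(records):
    """Yield a {key: value} dict of the ct-zone-* entries found in the
    Bridge external_ids of each decoded record."""
    for record in records:
        if not isinstance(record, dict):
            continue
        for row in record.get("Bridge", {}).values():
            if not isinstance(row, dict):
                continue  # e.g. a deleted row
            ext = row.get("external_ids")
            # OVSDB maps are encoded as ["map", [[key, value], ...]].
            if not (isinstance(ext, list) and len(ext) == 2 and ext[0] == "map"):
                continue
            yield {k: v for k, v in ext[1] if k.startswith("ct-zone-")}


def toggling_keys(records, min_changes=4):
    """Return {key: (sorted distinct values, number of changes)} for keys
    whose value changed at least min_changes times while only ever taking
    exactly two distinct values."""
    last = {}
    seen = defaultdict(set)
    changes = defaultdict(int)
    for zones in ct_zone_maps(records):
        for key, value in zones.items():
            seen[key].add(value)
            if key in last and last[key] != value:
                changes[key] += 1
            last[key] = value
    return {
        key: (sorted(seen[key]), count)
        for key, count in changes.items()
        if count >= min_changes and len(seen[key]) == 2
    }
```

A non-empty result would suggest that some zone ids are flipping back and
forth between two values across successive transactions.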

Thanks
Numan

Thanks

Brendan
_______________________________________________
discuss mailing list
[email protected]
https://urldefense.com/v3/__https://mail.openvswitch.org/mailman/listinfo/ovs-discuss__;!!ACWV5N9M2RV99hQ!bwIWH-KoNwkjzx2Sw8BLj6uGXg6zeGUoB-ZG4wtzO42NUmxA95Id3NxKLRgR-G4xGfo$
