Hi team, Recently, we face a weird issue on DNS resolv using OVN.
Case describe: 1. There is a VM instance or a VM cluster(3 VMs)on our deployment for a long time, and we enable the DNS via neutron-ovn. 2. The DNS resolve works as expected in our internal network for a while. 3. Suddentlly, the DNS resolve failed on the said VM or one of the cluster nodes. Notes: a. I had confirmed that the SouthBound DB contains the said DNS records for a long time. And we didn't change anything before the issue happened. b. For the cluster nodes(3 VMs), all VMs locate on different compute nodes(Chassis). But there is only 1 VM DNS resolution failure in our case. c. I traced the logic flow, and tcpdump the DNS traffic. That's true the DNS resp is generating by OVN and get 0 record which worked well before. How to resolve: Trigger the whole local ovn-controller to refresh the DNS records on its DNS local cache. What we found is live-migration of the error VM. Our question is: 1. Why DNS resolution failed on the local ovn-controller of compute node Chassis? a. Did the DNS local cache fail to sync with SouthboundDB? b. The DNS local cache MEM size is not limit, right? c. How to trace the DNS local cache on a running ovn-controller? I didn't find any CLI interface for it. d. Is there any suggestion for avoiding this issue? I'm failed to find any usable config opts for ovn-controller. If I miss some config options, please leave your kind suggestion. 2. Is there other scenario that might raise DNS resolution failure with no change? I mean from the scale deployment perspective and just maintenance the existing DNS. Thanks Best Regards, Bo Zhao
_______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss