Thanks Hrushi.

After further troubleshooting, we found that openvswitch-agent was somehow
reading a file named "sensu" from the /etc/sudoers.d directory and failing
to parse it, even though we haven't configured anything like that in the
neutron configs. Removing that file brought everything back to normal, but
it's still not clear why the agent was reading it. We are trying to figure
that out.
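
The thread doesn't say why the agent touched that file, but one plausible
mechanism (an assumption on our side, not confirmed here) is that the agent
runs its commands through sudo/rootwrap, and sudo parses every file under
/etc/sudoers.d on each invocation, so a single malformed drop-in like
"sensu" can fail every sudo call the agent makes. A quick way to spot such
a file is to syntax-check each fragment with visudo -cf, which validates a
sudoers file without installing it (check_sudoers_dir below is a
hypothetical helper name, not anything from neutron):

```shell
#!/bin/sh
# Sketch: syntax-check every fragment under /etc/sudoers.d.
# visudo -cf parses a sudoers file in check-only mode; a non-zero exit
# means the file would break sudo for everything that relies on it.
check_sudoers_dir() {
    dir="${1:-/etc/sudoers.d}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue          # skip subdirs and missing globs
        if visudo -cf "$f" >/dev/null 2>&1; then
            echo "OK: $f"
        else
            echo "BROKEN: $f"
        fi
    done
}
```

Running check_sudoers_dir (as root, since sudoers files are usually mode
0440) should flag any fragment that sudo itself would reject.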

*Rahul Sharma*
*MS in Computer Science, 2016*
College of Computer and Information Science, Northeastern University
Mobile:  801-706-7860
Email: rahulsharma...@gmail.com

On Mon, Mar 14, 2016 at 9:10 PM, Gangur, Hrushikesh <
hrushikesh.gan...@hpe.com> wrote:

> Rahul – it seems your issue is similar to the one reported here, probably
> due to a hostname resolution issue.
>
> https://bugs.launchpad.net/charms/+source/quantum-gateway/+bug/1405588
>
>
>
> Regards~hrushi
>
>
>
> *From:* Rahul Sharma [mailto:rahulsharma...@gmail.com
> <rahulsharma...@gmail.com>]
> *Sent:* Monday, March 14, 2016 3:32 PM
> *To:* openstack <openstack@lists.openstack.org>; OpenStack Development
> Mailing List <openstack-...@lists.openstack.org>;
> openstack-operat...@lists.openstack.org
> *Subject:* [Openstack-operators] [neutron] openvswitch-agent spins up too
> many /bin/ovsdb-client processes
>
>
>
> Hi All,
>
>
>
> We are trying to debug an issue with our production environment. We are
> seeing neutron-openvswitch-agent start failing after some time (1-2 days).
> After debugging, we found a large number of entries for ovsdb-client. On
> some nodes the count crosses 330 processes, and then the ovsdb process
> starts failing.
>
> root     30689     1  0 00:37 ?  00:00:00 /bin/ovsdb-client monitor Interface name,ofport --format=json
>
> root     30804     1  0 00:38 ?  00:00:00 /bin/ovsdb-client monitor Interface name,ofport --format=json
>
> root     30909     1  0 00:38 ?  00:00:00 /bin/ovsdb-client monitor Interface name,ofport --format=json
>
>
>
> Pastebin link for the processes: http://pastebin.com/QGQC0Jrt
>
> Pastebin link with openvswitch starting all of them:
> http://pastebin.com/repHMkHu
>
>
>
> In the logs, we start getting errors like:
>
> Mar 14 05:41:29 node2 ovs-vsctl: ovs|00001|fatal_signal|WARN|terminating
> with signal 14 (Alarm clock)
>
> Mar 14 05:41:39 node2 ovs-vsctl: ovs|00001|fatal_signal|WARN|terminating
> with signal 14 (Alarm clock)
>
> Mar 14 05:41:49 node2 ovs-vsctl: ovs|00001|fatal_signal|WARN|terminating
> with signal 14 (Alarm clock)
>
> Mar 14 05:49:30 node2 ovs-vsctl:
> ovs|00001|vsctl|ERR|unix:/var/run/openvswitch/db.sock: database connection
> failed (Protocol error)
>
> Mar 14 05:49:32 node2 ovs-vsctl:
> ovs|00001|vsctl|ERR|unix:/var/run/openvswitch/db.sock: database connection
> failed (Protocol error)
>
> Mar 14 05:49:34 node2 ovs-vsctl:
> ovs|00001|vsctl|ERR|unix:/var/run/openvswitch/db.sock: database connection
> failed (Protocol error)
>
>
>
> Open vSwitch version:
>
> [root@node2 ~(openstack_admin)]# ovs-vsctl --version
>
> ovs-vsctl (Open vSwitch) 2.4.0
>
> Compiled Sep  4 2015 09:49:34
>
> DB Schema 7.12.1
>
>
>
> We have to restart the openvswitch service every time, and that clears up
> all the processes. We are trying to figure out why so many processes are
> being started by the neutron agent. Also, we found that if we restart the
> host's networking, one new /bin/ovsdb-client process starts. We checked
> and found that we don't have any network fluctuations or NIC flapping.
> Are there any pointers on where we should be looking? It occurs on both
> controller and compute nodes.
>
>
> *Rahul Sharma*
> *MS in Computer Science, 2016*
> College of Computer and Information Science, Northeastern University
> Mobile:  801-706-7860
> Email: rahulsharma...@gmail.com
>
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack