On 05/06/2014 10:42 PM, Roman Sokolkov wrote:
> Hello, fuelers.
>
> I'm using Fuel 4.1A + Havana in HA mode.
>
> I permanently observe (on other deployments also) issue with stuck
> "nova-compute" service. But i think problem is more fundamental and
> relates to HA RabbitMQ and OpenStack AMQP driver implementation.
>
> Symptoms:
>
>   * Random nova-compute from time to time marked as "XXX" for a while.
>   * I see that service itself works properly. In logs i see that it
>     sends status updates to conductor. But actually nothing is sent.
>   * "netstat" shows that all connections to/from rabbit "ESTABLISHED"
>   * rabbitmqctl shows that "compute.node-x" queue synced to all slaves.
>   * nothing has been broken before, i mean rabbitmq cluster, etc.
>
> Axe style solution:
>
>   * /etc/init.d/openstack-nova-compute restart
>
> So here i've found a lot of interesting stuff (and solutions):
>
> https://bugs.launchpad.net/oslo.messaging/+bug/856764
>
>
> My questions are:
>
>   * Are there any thoughts particular for Fuel to solve/workaround this
>     issue?
>   * Any fast solution for this in 4.1? Like adjust TCP keep-alive timeouts?
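
On the TCP keep-alive question, here is a minimal sketch of what per-socket keep-alive tuning looks like in Python. It is only an illustration: the function name enable_tcp_keepalive and the timeout values are hypothetical, and it does not show how (or whether) the nova/oslo AMQP driver exposes its sockets for this. The TCP_KEEP* options are platform-specific, so they are guarded.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keep-alive probes on an existing socket.

    idle/interval/count are illustrative values, not recommendations.
    """
    # Ask the kernel to send keep-alive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timing knobs; not every platform defines all of them,
    # so apply only the ones this Python build exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_tcp_keepalive(s)
    # Non-zero means keep-alive is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

Without per-socket values, Linux falls back to the system-wide net.ipv4.tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes sysctls, and those only affect sockets that have SO_KEEPALIVE enabled in the first place.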

Perhaps the solution is to apply https://review.openstack.org/#/c/34949
and check the results with rabbitmq and nova. If it is OK, we could submit
a task for the OSCI team to patch our internal repos and update the
4.1.1 / 5.0 targeted MOS packages.

>
>
> --
> Roman Sokolkov,
> Deployment Engineer,
> Mirantis, Inc.
> Skype rsokolkov,
> rsokol...@mirantis.com
>

--
Best regards,
Bogdan Dobrelya,
Skype #bogdando_at_yahoo.com
Irc #bogdando

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev