Hi,

> After trying to reproduce this, I'm suspecting that the issue is actually
> on the server side from failing to drain the agent report state queue in
> time.

I have seen this before. At the time, I thought the scenario was as follows:

* A lot of create/update resource API calls were issued.
* The "rpc_conn_pool_size" pool was exhausted by sending notifications,
  which blocked further sending on the RPC side.
* The "rpc_thread_pool_size" pool was exhausted by threads waiting on the
  "rpc_conn_pool_size" pool in order to reply to RPCs.
* Receiving state_report was blocked because the "rpc_thread_pool_size"
  pool was exhausted.

Thanks
Itsuro Oda

On Thu, 4 Jun 2015 14:20:33 -0700
Kevin Benton <blak...@gmail.com> wrote:

> After trying to reproduce this, I'm suspecting that the issue is actually
> on the server side from failing to drain the agent report state queue in
> time.
>
> I set the report_interval to 1 second on the agent and added a logging
> statement and I see a report every 1 second even when sync_routers is
> taking a really long time.
>
> On Thu, Jun 4, 2015 at 11:52 AM, Carl Baldwin <c...@ecbaldwin.net> wrote:
>
> > Ann,
> >
> > Thanks for bringing this up. It has been on the shelf for a while now.
> >
> > Carl
> >
> > On Thu, Jun 4, 2015 at 8:54 AM, Salvatore Orlando <sorla...@nicira.com>
> > wrote:
> > > One reason for not sending the heartbeat from a separate greenthread could
> > > be that the agent is already doing it [1].
> > > The current proposed patch addresses the issue blindly - that is to say
> > > before declaring an agent dead let's wait for some more time because it
> > > could be stuck doing stuff. In that case I would probably make the
> > > multiplier (currently 2x) configurable.
> > >
> > > The reason for which state report does not occur is probably that both it
> > > and the resync procedure are periodic tasks. If I got it right they're both
> > > executed as eventlet greenthreads but one at a time. Perhaps then adding an
> > > initial delay to the full sync task might ensure the first thing an agent
> > > does when it comes up is sending a heartbeat to the server?
> > >
> > > On the other hand, while doing the initial full resync, is the agent able
> > > to process updates? If not perhaps it makes sense to have it down until it
> > > finishes synchronisation.
> >
> > Yes, it can! The agent prioritizes updates from RPC over full resync
> > activities.
> >
> > I wonder if the agent should check how long it has been since its last
> > state report each time it finishes processing an update for a router.
> > It normally doesn't take very long (relatively) to process an update
> > to a single router.
> >
> > I still would like to know why the thread to report state is being
> > starved. Anyone have any insight on this? I thought that with all
> > the system calls, the greenthreads would yield often. There must be
> > something I don't understand about it.
> >
> > Carl
> >
> > __________________________________________________________________________
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
>
> --
> Kevin Benton

-- 
Itsuro ODA <o...@valinux.co.jp>
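
As a rough illustration of the pool-exhaustion chain described at the top of
this message, here is a minimal Python sketch. It is not the actual RPC code:
OS threads and a semaphore stand in for the greenthreads and the real
connection pool, and every name in it (simulate, notifications_in_flight,
reply_to_rpc, handle_state_report) is hypothetical. Only the two option names
in the comments come from the scenario above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate(thread_pool_size=2, conn_pool_size=1,
             notifications_in_flight=1, wait=0.2):
    """Return (blocked, recovered) for a queued state report.

    thread_pool_size mimics "rpc_thread_pool_size" and conn_pool_size
    mimics "rpc_conn_pool_size". Both mappings are assumptions made
    for illustration only.
    """
    conn_pool = threading.Semaphore(conn_pool_size)

    # Step 1: outgoing notifications hold connections from the pool.
    for _ in range(notifications_in_flight):
        conn_pool.acquire()

    report_handled = threading.Event()

    def reply_to_rpc():
        # A worker needs a connection to send its reply, so it blocks
        # here for as long as the connection pool is exhausted.
        with conn_pool:
            pass

    def handle_state_report():
        report_handled.set()

    with ThreadPoolExecutor(max_workers=thread_pool_size) as workers:
        # Step 2: every worker ends up waiting for a connection.
        for _ in range(thread_pool_size):
            workers.submit(reply_to_rpc)
        # Step 3: the state report is queued behind the stuck workers.
        workers.submit(handle_state_report)
        blocked = not report_handled.wait(wait)

        # The notifications finish and return their connections.
        for _ in range(notifications_in_flight):
            conn_pool.release()
        recovered = report_handled.wait(5)

    return blocked, recovered


if __name__ == "__main__":
    print(simulate())                           # connection pool exhausted
    print(simulate(notifications_in_flight=0))  # connections available
```

In this sketch the state report is only processed once the held connections
are returned, which matches the chain above: the receiving side looks dead
even though the agent keeps sending reports on time.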