On Dec 4, 2013, at 8:55 AM, Carl Baldwin <c...@ecbaldwin.net> wrote:
> Stephen, all,
>
> I agree that there may be some opportunity to split things out a bit.
> However, I'm not sure what the best way will be. I recall that Mark
> mentioned breaking out the processes that handle API requests and RPC
> from each other at the summit. Anyway, it is something that has been
> discussed.
>
> I actually wanted to point out that the neutron server now has the
> ability to run a configurable number of sub-processes to handle a
> heavier load. Introduced with this commit:
>
> https://review.openstack.org/#/c/37131/
>
> Set api_workers to something > 1 and restart the server.
>
> The server can also be run on more than one physical host in
> combination with multiple child processes.

I completely misunderstood the import of the commit in question. Being able to run the wsgi server(s) out of process is a nice improvement, thank you for making it happen. Has there been any discussion around making the default for api_workers > 0 (at least 1) to ensure that the default configuration separates wsgi and rpc load? This also seems like a great candidate for backporting to havana and maybe even grizzly, although api_workers should probably default to 0 in those cases.

FYI, I re-ran the test that attempted to boot 75 micro VMs simultaneously with api_workers = 2, with mixed results. The increased wsgi throughput resulted in almost half of the boot requests failing with 500 errors due to QueuePool errors (https://bugs.launchpad.net/neutron/+bug/1160442) in Neutron. It also appears that maximizing the number of wsgi requests has the side effect of increasing the RPC load on the main process, which means that the problem of dhcp notifications being dropped is little improved. In any case, I intend to submit a fix that ensures that notifications are sent regardless of agent status.

m.
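To make the stop-gap concrete, here is a minimal sketch of the difference between dropping notifications for agents that look down and sending them regardless. This is not Neutron's code: the Agent class, the notify function, the AGENT_DOWN_TIME value and the message string are all hypothetical names chosen for illustration. It only models the behaviour described in this thread, where a heartbeat delayed by load makes a live agent look dead.

```python
from dataclasses import dataclass, field

# Hypothetical threshold (seconds) after which an agent is assumed down.
AGENT_DOWN_TIME = 75


@dataclass
class Agent:
    """Illustrative stand-in for a DHCP agent as seen by the server."""
    host: str
    last_heartbeat: float
    inbox: list = field(default_factory=list)

    def is_down(self, now, down_time=AGENT_DOWN_TIME):
        # The server only knows when it last processed a status update,
        # so a delayed update is indistinguishable from a dead agent.
        return now - self.last_heartbeat > down_time


def notify(agents, message, now, skip_down_agents):
    """Deliver message to agents and return the hosts that were skipped."""
    skipped = []
    for agent in agents:
        if skip_down_agents and agent.is_down(now):
            # Current behaviour: the notification is silently dropped.
            skipped.append(agent.host)
            continue
        agent.inbox.append(message)
    return skipped


now = 1000.0
fresh = Agent("dhcp-1", last_heartbeat=now - 10)
late = Agent("dhcp-2", last_heartbeat=now - 120)  # heartbeat delayed by load

dropped = notify([fresh, late], "port_create_end", now, skip_down_agents=True)
resent = notify([fresh, late], "port_create_end", now, skip_down_agents=False)
```

With skip_down_agents=True the late agent never hears about the new port; with it set to False both agents receive the message, at the cost of also sending to agents that really are down.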
>
> Carl
>
> On Tue, Dec 3, 2013 at 9:47 AM, Stephen Gran
> <stephen.g...@theguardian.com> wrote:
>> On 03/12/13 16:08, Maru Newby wrote:
>>>
>>> I've been investigating a bug that is preventing VM's from receiving IP
>>> addresses when a Neutron service is under high load:
>>>
>>> https://bugs.launchpad.net/neutron/+bug/1192381
>>>
>>> High load causes the DHCP agent's status updates to be delayed, causing
>>> the Neutron service to assume that the agent is down. This results in the
>>> Neutron service not sending notifications of port addition to the DHCP
>>> agent. At present, the notifications are simply dropped. A simple fix is
>>> to send notifications regardless of agent status. Does anybody have any
>>> objections to this stop-gap approach? I'm not clear on the implications of
>>> sending notifications to agents that are down, but I'm hoping for a simple
>>> fix that can be backported to both havana and grizzly (yes, this bug has
>>> been with us that long).
>>>
>>> Fixing this problem for real, though, will likely be more involved. The
>>> proposal to replace the current wsgi framework with Pecan may increase the
>>> Neutron service's scalability, but should we continue to use a 'fire and
>>> forget' approach to notification? Being able to track the success or
>>> failure of a given action outside of the logs would seem pretty important,
>>> and allow for more effective coordination with Nova than is currently
>>> possible.
>>
>>
>> It strikes me that we ask an awful lot of a single neutron-server instance -
>> it has to take state updates from all the agents, it has to do scheduling,
>> it has to respond to API requests, and it has to communicate about actual
>> changes with the agents.
>>
>> Maybe breaking some of these out the way nova has a scheduler and a
>> conductor and so on might be a good model (I know there are things people
>> are unhappy about with nova-scheduler, but imagine how much worse it would
>> be if it was built into the API).
>>
>> Doing all of those tasks, and doing it largely single threaded, is just
>> asking for overload.
>>
>> Cheers,
>> --
>> Stephen Gran
>> Senior Systems Integrator - theguardian.com
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev