On 2 February 2017 at 12:50, Sean Dague <s...@dague.net> wrote:
> On 02/02/2017 03:32 PM, Armando M. wrote:
> > On 2 February 2017 at 12:19, Sean Dague <s...@dague.net> wrote:
> > > On 02/02/2017 02:28 PM, Armando M. wrote:
> > > > On 2 February 2017 at 10:08, Sean Dague <s...@dague.net> wrote:
> > > > > On 02/02/2017 12:49 PM, Armando M. wrote:
> > > > > > On 2 February 2017 at 08:40, Sean Dague <s...@dague.net> wrote:
> > > > > > > On 02/02/2017 11:16 AM, Matthew Treinish wrote:
> > > > > > > <snip>
> > > > > > > > <oops, forgot to finish my thought>
> > > > > > > >
> > > > > > > > We definitely aren't saying running a single worker is how we
> > > > > > > > recommend people run OpenStack by doing this. But it just adds
> > > > > > > > on to the differences between the gate and what we expect
> > > > > > > > things actually look like.
> > > > > > >
> > > > > > > I'm all for actually getting to the bottom of this, but honestly
> > > > > > > real memory profiling is needed here. The growth across projects
> > > > > > > probably means that some common libraries are some part of this.
> > > > > > > The ever growing requirements list is demonstrative of that.
> > > > > > > Code reuse is good, but if we are importing much of a library to
> > > > > > > get access to a couple of functions, we're going to take a bunch
> > > > > > > of memory weight on that (especially if that library has friendly
> > > > > > > auto imports in top level __init__.py so we can't get only the
> > > > > > > parts we want).
> > > > > > >
> > > > > > > Changing the worker count is just shuffling around deck chairs.
> > > > > > >
> > > > > > > I'm not familiar enough with memory profiling tools in python to
> > > > > > > know the right approach we should take there to get this down to
> > > > > > > individual libraries / objects that are containing all our
> > > > > > > memory. Anyone more skilled here able to help lead the way?
> > > > > >
> > > > > > From what I hear, the overall consensus on this matter is to
> > > > > > determine what actually caused the memory consumption bump and how
> > > > > > to address it, but that's more of a medium to long term action. In
> > > > > > fact, to me this is one of the top priority matters we should talk
> > > > > > about at the imminent PTG.
> > > > > >
> > > > > > For the time being, and to provide relief to the gate, should we
> > > > > > want to lock the API_WORKERS to 1? I'll post something for review
> > > > > > and see how many people shoot it down :)
> > > > >
> > > > > I don't think we want to do that. It's going to force down the
> > > > > eventlet API workers to being a single process, and it's not super
> > > > > clear that eventlet handles backups on the inbound socket well. I
> > > > > honestly would expect that creates different hard to debug issues,
> > > > > especially with high chatter rates between services.
> > > >
> > > > I must admit I share your fear, but out of the tests that I have
> > > > executed so far in [1,2,3], the house didn't burn in a fire. I am
> > > > looking for other ways to have a substantial memory saving with a
> > > > relatively quick and dirty fix, but coming up empty handed thus far.
> > > >
> > > > [1] https://review.openstack.org/#/c/428303/
> > > > [2] https://review.openstack.org/#/c/427919/
> > > > [3] https://review.openstack.org/#/c/427921/
> > >
> > > This failure in the first patch -
> > > http://logs.openstack.org/03/428303/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/71f42ea/logs/screen-n-api.txt.gz?level=TRACE#_2017-02-02_19_14_11_751
> > > looks exactly like what I would expect from API worker starvation.
> >
> > Not sure I agree on this one: this has been observed multiple times in
> > the gate already [1] (though I am not sure there's a bug for it), and I
> > don't believe it has anything to do with the number of API workers,
> > unless not even two workers are enough.
>
> There is no guarantee that 2 workers are enough. I'm not surprised if we
> see some of that failure today. This was all guess work on trimming
> worker counts to deal with the memory issue in the past. But we're
> running tests in parallel, and the services are making calls back to
> other services all the time.
>
> This is one of the reasons to get the wsgi stack off of eventlet and
> into a real webserver, as they handle HTTP request backups much, much
> better.
>
> I do understand that people want a quick fix here, but I'm not convinced
> that it exists.
Fair enough. The main intent of this conversation for me was to spur debate
and gather opinions. So long as we agree that fixing memory hunger is a
concerted effort, and that we cannot let one service go on a diet while
another binges, I am OK limping along for as long as it takes to bring
things back into shape. In the meantime, a couple of rough profiling
sketches follow below, in case they help someone start pinning the growth
down to specific libraries and workers.

> -Sean
>
> --
> Sean Dague
> http://dague.net
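First, a minimal sketch of the kind of memory profiling Sean is asking
about, using the stdlib tracemalloc module to see how much a single import
drags in and which files the allocations come from. The module name below
is only an example; substitute whichever library is under suspicion, and
run it in a fresh interpreter so the target isn't already cached in
sys.modules:

    # Measure the memory cost of importing one library, attributed to the
    # files that did the allocating. Only Python-level allocations show up;
    # C extensions and interpreter overhead are not counted.
    import importlib
    import tracemalloc

    TARGET = "neutron_lib"  # example only: any suspect library

    tracemalloc.start(25)  # keep up to 25 frames per allocation

    before = tracemalloc.take_snapshot()
    importlib.import_module(TARGET)
    after = tracemalloc.take_snapshot()

    stats = after.compare_to(before, "filename")
    total = sum(stat.size_diff for stat in stats)
    print("importing %s cost %.1f MiB" % (TARGET, total / 1048576.0))
    for stat in stats[:20]:  # top offenders by file
        print(stat)

Since tracemalloc misses native allocations, comparing the process RSS
before and after the import is a useful sanity check on the numbers it
reports.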
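Second, a rough sketch for putting a number on what trimming API_WORKERS
would actually buy, by summing the resident memory of a service's
processes. It assumes psutil is available, matches crudely on the command
line, and the service names are only examples:

    # Sum resident set size (RSS) across all processes whose command line
    # mentions the given fragment, e.g. every nova-api worker.
    import psutil

    def rss_mib(fragment):
        total = 0
        for proc in psutil.process_iter(["cmdline", "memory_info"]):
            cmdline = " ".join(proc.info["cmdline"] or [])
            mem = proc.info["memory_info"]
            if mem is not None and fragment in cmdline:
                total += mem.rss
        return total / 1048576.0

    for name in ("nova-api", "neutron-server"):  # example service names
        print("%s: %.1f MiB resident" % (name, rss_mib(name)))

Keep in mind that RSS double-counts pages shared between forked workers;
on Linux, psutil's memory_full_info().pss gives a fairer per-process
figure, at the cost of needing read access to /proc/<pid>/smaps.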