On 2 February 2017 at 13:34, Ihar Hrachyshka <ihrac...@redhat.com> wrote:
> The BadStatusLine error is well known:
> https://bugs.launchpad.net/nova/+bug/1630664

That's the one! I knew I had seen it in the past!

> Now, it doesn't mean that the root cause of the error message is the
> same, and it may as well be that lowering the number of workers
> triggered it. All I am saying is we saw that error in the past.
>
> Ihar
>
> On Thu, Feb 2, 2017 at 1:07 PM, Kevin Benton <ke...@benton.pub> wrote:
> > This error seems to be new in the ocata cycle. It's either related to a
> > dependency change or the fact that we put Apache in between the services
> > now. Handling more concurrent requests than workers wasn't an issue
> > before.
> >
> > It seems that you are suggesting that eventlet can't handle concurrent
> > connections, which is the entire purpose of the library, no?
> >
> > On Feb 2, 2017 13:53, "Sean Dague" <s...@dague.net> wrote:
> >>
> >> On 02/02/2017 03:32 PM, Armando M. wrote:
> >> >
> >> > On 2 February 2017 at 12:19, Sean Dague <s...@dague.net> wrote:
> >> >
> >> > On 02/02/2017 02:28 PM, Armando M. wrote:
> >> > >
> >> > > On 2 February 2017 at 10:08, Sean Dague <s...@dague.net> wrote:
> >> > >
> >> > > On 02/02/2017 12:49 PM, Armando M. wrote:
> >> > > >
> >> > > > On 2 February 2017 at 08:40, Sean Dague <s...@dague.net> wrote:
> >> > > >
> >> > > > On 02/02/2017 11:16 AM, Matthew Treinish wrote:
> >> > > > <snip>
> >> > > > > <oops, forgot to finish my thought>
> >> > > > >
> >> > > > > We definitely aren't saying running a single worker is how we
> >> > > > > recommend people run OpenStack by doing this. But it just adds
> >> > > > > on to the differences between the gate and what we expect
> >> > > > > things actually look like.
> >> > > >
> >> > > > I'm all for actually getting to the bottom of this, but honestly
> >> > > > real memory profiling is needed here. The growth across projects
> >> > > > probably means that some common libraries play a part in this.
> >> > > > The ever-growing requirements list is demonstrative of that.
> >> > > > Code reuse is good, but if we are importing much of a library to
> >> > > > get access to a couple of functions, we're going to take a bunch
> >> > > > of memory weight on that (especially if that library has
> >> > > > friendly auto imports in top level __init__.py so we can't get
> >> > > > only the parts we want).
> >> > > >
> >> > > > Changing the worker count is just shuffling around deck chairs.
> >> > > >
> >> > > > I'm not familiar enough with memory profiling tools in Python to
> >> > > > know the right approach we should take there to get this down to
> >> > > > the individual libraries / objects that are holding all our
> >> > > > memory. Anyone more skilled here able to help lead the way?
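
Chiming in on the profiling question: one low-effort starting point is
tracemalloc (in the stdlib since Python 3.4, available for 2.7 as the
pytracemalloc backport), which can attribute live allocations back to
source files and, roughly, to the libraries we import. Purely as an
illustration of the idea, nothing service-specific:

    # Illustrative sketch only: start tracing early (ideally before the
    # heavy imports), let the process do some work, then see which
    # source files own the live allocations.
    import tracemalloc

    tracemalloc.start(25)  # keep up to 25 frames per allocation

    # ... import the service, serve a bunch of API requests, etc. ...

    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics('filename')[:20]:
        print(stat)

Grouping by 'filename' is coarse, but it should at least tell us whether
the growth is concentrated in a handful of imported libraries or spread
across the services' own objects.
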
> >> > > > From what I hear, the overall consensus on this matter is to
> >> > > > determine what actually caused the memory consumption bump and
> >> > > > how to address it, but that's more of a medium to long term
> >> > > > action. In fact, to me this is one of the top priority matters
> >> > > > we should talk about at the imminent PTG.
> >> > > >
> >> > > > For the time being, and to provide relief to the gate, should we
> >> > > > lock API_WORKERS to 1? I'll post something for review and see
> >> > > > how many people shoot it down :)
> >> > >
> >> > > I don't think we want to do that. It's going to force the eventlet
> >> > > API workers down to a single process, and it's not super clear
> >> > > that eventlet handles backups on the inbound socket well. I
> >> > > honestly would expect that creates different hard to debug issues,
> >> > > especially with high chatter rates between services.
> >> > >
> >> > > I must admit I share your fear, but out of the tests that I have
> >> > > executed so far in [1,2,3], the house didn't burn down. I am
> >> > > looking for other ways to have a substantial memory saving with a
> >> > > relatively quick and dirty fix, but I am coming up empty handed
> >> > > thus far.
> >> > >
> >> > > [1] https://review.openstack.org/#/c/428303/
> >> > > [2] https://review.openstack.org/#/c/427919/
> >> > > [3] https://review.openstack.org/#/c/427921/
> >> >
> >> > This failure in the first patch:
> >> > http://logs.openstack.org/03/428303/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/71f42ea/logs/screen-n-api.txt.gz?level=TRACE#_2017-02-02_19_14_11_751
> >> > looks exactly like what I would expect from API worker starvation.
> >> >
> >> > Not sure I agree on this one; this has been observed multiple times
> >> > in the gate already [1] (though I am not sure there's a bug for it),
> >> > and I don't believe it has anything to do with the number of API
> >> > workers, unless not even two workers are enough.
> >>
> >> There is no guarantee that 2 workers are enough. I'm not surprised if
> >> we see some of that failure today. This was all guesswork on trimming
> >> worker counts to deal with the memory issue in the past. But we're
> >> running tests in parallel, and the services are making calls back to
> >> other services all the time.
> >>
> >> This is one of the reasons to get the wsgi stack off of eventlet and
> >> into a real webserver, as they handle HTTP request backups much, much
> >> better.
> >>
> >> I do understand that people want a quick fix here, but I'm not
> >> convinced that it exists.
> >>
> >> -Sean
> >>
> >> --
> >> Sean Dague
> >> http://dague.net
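
For anyone who wants to poke at the single-worker behaviour outside the
gate, a stripped-down eventlet WSGI server is easy to stand up. To be
clear, this is just a toy for experimenting with how one eventlet process
absorbs concurrent requests, not how any of our services are actually
wired up, and the pool size and backlog numbers below are arbitrary:

    # Toy single-process "API worker": each request is served by a green
    # thread from the pool; once the pool is exhausted, new connections
    # sit in the listen socket's backlog (or get refused).
    import eventlet
    eventlet.monkey_patch()

    from eventlet import wsgi


    def app(environ, start_response):
        eventlet.sleep(0.5)  # stand-in for a call out to another service
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'ok\n']


    if __name__ == '__main__':
        sock = eventlet.listen(('127.0.0.1', 8080), backlog=128)
        pool = eventlet.GreenPool(100)
        wsgi.server(sock, app, custom_pool=pool)

Hammering that with more concurrent clients than the pool can absorb
shows requests queueing up on the inbound socket, which is roughly the
failure mode being debated here for a single API worker, and part of why
a real webserver in front handles backups differently.
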
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev