At Wed, 1 Feb 2017 17:37:34 -0700, Kevin Benton wrote:
> And who said openstack wasn't growing? ;)
>
> I think reducing API workers is a nice quick way to bring back some
> stability.
>
> I have spent a bunch of time digging into the OOM killer events and
> haven't yet figured out why they are being triggered. There is
> significant swap space remaining in all of the cases I have seen, so
> it's likely some memory locking issue or kernel allocations blocking
> swap. Until we can figure out the cause, we effectively have no usable
> swap space on the test instances, so we are limited to 8GB.

We can try increasing watermark_scale_factor instead. I looked at two
random oom-killer invocations, but free memory was above the watermark
in both. The oom-killer was triggered by a 16kB contiguous page
allocation in apparmor_file_alloc_security, so if we can try disabling
apparmor, that may also work.
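In case anyone wants to check their own nodes, below is a rough sketch
of the kind of check I mean: it parses /proc/zoneinfo and flags zones
whose free page count has dropped below the low watermark. The field
layout is assumed from 4.x kernels, so treat it as illustrative rather
than a tool. Raising the watermarks themselves is then a one-liner,
e.g. "sysctl -w vm.watermark_scale_factor=200" (the default is 10, in
units of 1/10000 of the zone's memory).

import re

def zone_watermarks(path='/proc/zoneinfo'):
    """Return per-zone free/min/low/high counts, in pages."""
    zones, current = {}, None
    with open(path) as f:
        for line in f:
            m = re.match(r'Node (\d+), zone\s+(\S+)', line)
            if m:
                current = zones.setdefault('node%s/%s' % m.groups(), {})
                continue
            # 'pages free' on the first line, bare min/low/high after it
            m = re.match(r'\s+(?:pages )?(free|min|low|high)\s+(\d+)', line)
            if m and current is not None:
                current[m.group(1)] = int(m.group(2))
    return zones

for name, wm in sorted(zone_watermarks().items()):
    if 'free' in wm and 'low' in wm:
        mark = '  <-- below low watermark' if wm['free'] < wm['low'] else ''
        print('%-14s free=%-8d min=%-8d low=%-8d high=%-8d%s' % (
            name, wm['free'], wm.get('min', 0), wm['low'],
            wm.get('high', 0), mark))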
> On Feb 1, 2017 17:27, "Armando M." <arma...@gmail.com> wrote:
>
> > Hi,
> >
> > [TL;DR]: OpenStack services have steadily increased their memory
> > footprints. We need a concerted way to address the oom-kills
> > experienced in the openstack gate, as we may have reached a ceiling.
> >
> > Now the longer version:
> > --------------------------------
> >
> > We have been experiencing some instability in the gate lately due to
> > a number of reasons. When everything adds up, this means it's rather
> > difficult to merge anything, and knowing we're in feature freeze,
> > that adds to the stress. One culprit was identified to be [1].
> >
> > We initially tried to increase the swappiness, but that didn't seem
> > to help. Then we looked at the resident memory in use. Going back
> > over the past three releases, we noticed that the aggregated memory
> > footprint of some openstack projects has grown steadily. We have the
> > following:
> >
> > - Mitaka
> >   - neutron: 1.40GB
> >   - nova: 1.70GB
> >   - swift: 640MB
> >   - cinder: 730MB
> >   - keystone: 760MB
> >   - horizon: 17MB
> >   - glance: 538MB
> > - Newton
> >   - neutron: 1.59GB (+13%)
> >   - nova: 1.67GB (-1%)
> >   - swift: 779MB (+21%)
> >   - cinder: 878MB (+20%)
> >   - keystone: 919MB (+20%)
> >   - horizon: 21MB (+23%)
> >   - glance: 721MB (+34%)
> > - Ocata
> >   - neutron: 1.75GB (+10%)
> >   - nova: 1.95GB (+16%)
> >   - swift: 703MB (-9%)
> >   - cinder: 920MB (+4%)
> >   - keystone: 903MB (-1%)
> >   - horizon: 25MB (+20%)
> >   - glance: 740MB (+2%)
> >
> > Numbers are approximate and I only took a couple of samples, but in
> > a nutshell, the majority of the services have seen double-digit
> > growth over the past two cycles in terms of the amount of RSS memory
> > they use.
> >
> > Since [1] is observed only since ocata [2], I imagine it's pretty
> > reasonable to assume that the memory increase may well be a
> > determining factor in the oom-kills we see in the gate.
> >
> > Profiling and surgically reducing the memory used by each component
> > in each service is a lengthy process, but I'd rather see some gate
> > relief right away. Reducing the number of API workers helps bring
> > the RSS memory back down to mitaka levels:
> >
> > - neutron: 1.54GB
> > - nova: 1.24GB
> > - swift: 694MB
> > - cinder: 778MB
> > - keystone: 891MB
> > - horizon: 24MB
> > - glance: 490MB
> >
> > However, it may have other side effects, like longer execution times
> > or an increase in timeouts.
> >
> > Where do we go from here? I am not particularly fond of the stop-gap
> > [4], but it is the one fix that most widely addresses the memory
> > increase we have experienced across the board.
> >
> > Thanks,
> > Armando
> >
> > [1] https://bugs.launchpad.net/neutron/+bug/1656386
> > [2] http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22oom-killer%5C%22%20AND%20tags:syslog
> > [3] http://logs.openstack.org/21/427921/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/82084c2/
> > [4] https://review.openstack.org/#/c/427921
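As an aside, for anyone who wants to reproduce footprint numbers like
the ones quoted above on their own devstack node, a stdlib-only sketch
along these lines should get you in the ballpark. The matching is a
naive substring test on the cmdline, and the service names below are
just illustrative, so treat the totals as approximate:

import os

def total_rss_kb(pattern):
    """Sum VmRSS (in kB) over processes whose cmdline contains pattern."""
    total = 0
    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open('/proc/%s/cmdline' % pid) as f:
                if pattern not in f.read().replace('\0', ' '):
                    continue
            with open('/proc/%s/status' % pid) as f:
                for line in f:
                    if line.startswith('VmRSS:'):
                        total += int(line.split()[1])  # value is in kB
        except (IOError, OSError):
            continue  # the process exited while we were reading
    return total

for svc in ('neutron', 'nova', 'swift', 'cinder', 'keystone', 'glance'):
    print('%-10s %7.0f MB' % (svc, total_rss_kb(svc) / 1024.0))

One caveat: summing VmRSS counts pages that the API workers share with
their parent once per process, which is part of why cutting the worker
count pulls these totals down so sharply.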
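Finally, since the logstash query in [2] only counts the oom-killer
hits, something like the following can pull the invoking process and
the allocation order out of the syslog files under [3]. The log format
is assumed from 4.x kernels; order=2 corresponds to the 16kB contiguous
allocation I mentioned:

import re
import sys

# The kernel logs an invocation roughly as:
#   apparmor invoked oom-killer: gfp_mask=0x26040c0(...), order=2, ...
# order=N is a request for 2^N contiguous 4kB pages.
PATTERN = re.compile(
    r'(\S+) invoked oom-killer: gfp_mask=(\S+?), order=(\d+)')

for path in sys.argv[1:]:
    with open(path) as f:
        for line in f:
            m = PATTERN.search(line)
            if m:
                proc, gfp, order = m.groups()
                print('%s: %s gfp_mask=%s order=%s (%d kB contiguous)'
                      % (path, proc, gfp, order, 4 * 2 ** int(order)))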