Well, my crazy idea was the addition[10] of an extra argument (--mem-trace) to the 
pbr binary creation.  The idea is to be able to use it from any OpenStack 
binary and print the methods that make a difference in memory 
consumption[11].
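
For illustration only, here is a minimal sketch of the idea (not the code in [10]; 
the wrapped entry point is made up and it assumes Python 3's tracemalloc): a 
pbr-generated console script could run its real entry point under tracemalloc when 
the proposed flag is passed, and print the call sites that allocate the most memory 
on exit.

    # Hypothetical sketch, not the implementation from [10]: when the
    # proposed memory-trace flag is passed, run the real entry point under
    # tracemalloc and report the top allocating call sites on exit.
    import sys
    import tracemalloc


    def real_main():
        # Stand-in for the console script's real entry point.
        data = [object() for _ in range(100000)]
        return 0


    def main():
        if '--mem-trace' not in sys.argv:
            return real_main()
        sys.argv.remove('--mem-trace')
        tracemalloc.start()
        try:
            return real_main()
        finally:
            snapshot = tracemalloc.take_snapshot()
            for stat in snapshot.statistics('lineno')[:10]:
                print(stat)


    if __name__ == '__main__':
        sys.exit(main())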

Regards/Saludos
Victor Morales
irc: electrocucaracha

[10] https://review.openstack.org/#/c/433947/
[11] http://paste.openstack.org/show/599087/



From:  Jordan Pittier <[email protected]>
Reply-To:  "OpenStack Development Mailing List (not for usage questions)" 
<[email protected]>
Date:  Friday, March 17, 2017 at 7:27 AM
To:  "OpenStack Development Mailing List (not for usage questions)" 
<[email protected]>
Subject:  Re: [openstack-dev] [QA][gate][all] dsvm gate stability and scenario tests


The patch that reduced the number of Tempest scenarios we run in every job and 
also reduced the test run concurrency [0] was merged 13 days ago. Since then, the 
situation (i.e. the high number of false-negative job results) has not improved 
significantly. We need to keep looking collectively at this.


There seems to be an agreement that we are hitting some memory limit. Several 
of our most frequent failures are memory related [1]. So we should either 
reduce our memory usage or ask for bigger VMs, with more than 8GB of RAM.


There have been several attempts to reduce our memory usage: reducing the MySQL 
memory consumption ([2], but quickly reverted [3]), reducing the number of 
Apache workers ([4], [5]), and more apache2 tuning [6]. If you have any crazy idea 
to help in this regard, please share it. This is a high priority for the whole 
OpenStack project, because it's plaguing many projects.


We have some tools to investigate memory consumption, like some regular "dstat" 
output [7], a home-made memory tracker [8] and stackviz [9].
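
As a quick example of mining those logs, a few lines of Python are enough to pull 
the peak of the memory "used" column out of the dstat CSV [7] (rough sketch only; 
the column name and the layout of dstat's --output file are assumptions, so check 
the header rows in the actual log first):

    # Minimal sketch: report the peak of the memory "used" column from a
    # dstat CSV log. Assumes a dstat --output style layout: a few metadata
    # rows, then a row of column names, then one data row per sample.
    import csv
    import sys


    def peak_used_memory(path):
        peak = 0.0
        used_idx = None
        with open(path) as f:
            for row in csv.reader(f):
                if used_idx is None:
                    if 'used' in row:        # the column-name row
                        used_idx = row.index('used')
                    continue
                try:
                    peak = max(peak, float(row[used_idx]))
                except (IndexError, ValueError):
                    continue                 # skip short or malformed rows
        return peak


    if __name__ == '__main__':
        # assuming the values are byte counts
        print('peak used memory: %.0f MiB' % (peak_used_memory(sys.argv[1]) / 1048576))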


Best,

Jordan

[0]: https://review.openstack.org/#/c/439698/
[1]: http://status.openstack.org/elastic-recheck/gate.html
[2]: https://review.openstack.org/#/c/438668/
[3]: https://review.openstack.org/#/c/446196/
[4]: https://review.openstack.org/#/c/426264/
[5]: https://review.openstack.org/#/c/445910/
[6]: https://review.openstack.org/#/c/446741/
[7]: http://logs.openstack.org/96/446196/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b5c362f/logs/dstat-csv_log.txt.gz
[8]: http://logs.openstack.org/96/446196/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b5c362f/logs/screen-peakmem_tracker.txt.gz
[9]: http://logs.openstack.org/41/446741/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/fa4d2e6/logs/stackviz/#/stdin/timeline

On Sat, Mar 4, 2017 at 4:19 PM, Andrea Frittoli 
<[email protected]> wrote:

Quick update on this: the change is now merged, so we now have a smaller number 
of scenario tests running serially after the API test run.
We'll monitor gate stability for the next week or so and decide whether further 
actions are required.
Please keep categorizing failures via elastic recheck as usual.
thank you
andrea

On Fri, 3 Mar 2017, 8:02 a.m. Ghanshyam Mann, <[email protected]> wrote:


Thanks. +1. I added my list in the ethercalc.

The left-out scenario tests can be run in periodic and experimental jobs. IMO on 
both (periodic and experimental), so that we can monitor their status periodically 
as well as on a particular patch if we need to.

-gmann

On Fri, Mar 3, 2017 at 4:28 PM, Andrea Frittoli
<[email protected]> wrote:




Hello folks,

We have discussed issues with gate stability a lot since the PTG; we need a 
stable and reliable gate to ensure smooth progress in Pike.

One of the issues that stands out is that most of the time during test runs 
our test VMs are under heavy load.
This can be the common cause behind several failures we've seen in the gate, so 
we agreed during the QA meeting yesterday [0] that we're going to try reducing 
the load and see whether that improves stability.


Next steps are:

- select a subset of scenario tests to be executed in the gate, based on [1], 
and run them serially only
- the patch for this is [2] and we will approve it by the end of the day
- we will monitor stability for a week - if needed we may reduce concurrency a 
bit on API tests as well, and identify "heavy" tests that are candidates for 
removal / refactor
- the QA team won't approve any new tests (scenario or heavy resource-consuming 
API) until gate stability is ensured

Thanks for your patience and collaboration!

Andrea

---
irc: andreaf

[0] http://eavesdrop.openstack.org/meetings/qa/2017/qa.2017-03-02-17.00.txt
[1] https://ethercalc.openstack.org/nu56u2wrfb2b
[2] https://review.openstack.org/#/c/439698/

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
