+1

Testing was done in an OEL64*/KVM environment using a small-footprint OEL64 VM image with userdata, reporting to a central server to verify that VMs come up properly. KVM hosts are 32 cores / 256 GB RAM / 1 Gbit / local storage on fast RAID (8x600G; RAID5). The manager runs in VirtualBox with a local MySQL.

- Raised the per-hypervisor VM limit via the API to 100 and did a single-hypervisor deployment of 80 VMs through 5 cycles (deploy/start/stop/destroy). 2.2 seconds per startVM API call on average. 100% hit ratio.
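The cycle itself is nothing fancy; a minimal Python sketch of the flow follows, where deploy_vm/start_vm/stop_vm/destroy_vm are hypothetical stubs standing in for the real ACS API binding calls (deployVirtualMachine, startVirtualMachine, etc.):

```python
import time

# Hypothetical stand-ins for the ACS API binding calls used in the
# real test; they only mimic the state transitions locally.
def deploy_vm(i):   return {"id": i, "state": "Stopped"}
def start_vm(vm):   vm["state"] = "Running"
def stop_vm(vm):    vm["state"] = "Stopped"
def destroy_vm(vm): vm["state"] = "Destroyed"

def churn(n_vms=80, n_cycles=5):
    """Run the deploy/start/stop/destroy churn; return the average
    wall-clock time per start_vm call for each cycle."""
    per_cycle_avg = []
    for _ in range(n_cycles):
        vms = [deploy_vm(i) for i in range(n_vms)]
        t0 = time.time()
        for vm in vms:
            start_vm(vm)
        per_cycle_avg.append((time.time() - t0) / n_vms)
        for vm in vms:
            stop_vm(vm)
        for vm in vms:
            destroy_vm(vm)
    return per_cycle_avg

averages = churn()
```

In the real runs the interesting number is that per-cycle average for the start phase, measured against the live API.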

- Tested concurrency of the startVM operation, which was more or less nonexistent in 4.1.x: 8 KVM hosts, deploying 180 VMs in a deploy/start/stop/destroy cycle. Depending on how many concurrent API calls are made, the time per startVM call averages from 0.8 to 1.5 seconds during the start-VM phase. During this test roughly one VM in 1000 fails to report back to the log server (they are reported as started in the ACS MGR). This could be due to any number of reasons not cloud-related, but I have not dug into it.
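The concurrent-call pattern used against the API can be approximated as below (a hedged sketch: start_vm is a stub standing in for the real binding's startVirtualMachine call, and the semaphore value of 40 matches the throttle used for most operations in the test):

```python
import threading
import time

CONCURRENCY = 40  # throttle: max API operations in flight at once

# Stub standing in for the ACS API binding's startVirtualMachine call.
def start_vm(vm_id):
    time.sleep(0.001)  # simulate API round-trip latency
    return vm_id

throttle = threading.Semaphore(CONCURRENCY)
started = []
lock = threading.Lock()

def worker(vm_id):
    with throttle:  # blocks until a slot is free
        result = start_vm(vm_id)
    with lock:
        started.append(result)

# One worker per VM, as in the 180-VM start phase.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(180)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A semaphore is the simplest way to cap concurrency without a thread pool; the real test drives many more calls than slots, so the throttle is what keeps the manager from being flooded.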

ACS 4.2 only recently passed this test, due to what appears to be a leak of ROOT images: after running the test for a while, a large number of ROOT images were left in the Expunging state, impossible to remove, and they were not cleaned up after 24 hours. I have not seen this in the latest 4.2. The issue was non-deterministic and left no log traces that obviously gave away the cause (I have not enabled debug logging, since that generates too much data during this type of testing).

ACS 4.2 has significantly better concurrency than 4.1.1 and earlier: roughly 10 times faster. The setup used in the testing is basic networking with security groups. For 4.1.1 the average startVM call times are in the 11+ second ballpark, never lower, and the figure does not change when adding more hosts. For this ACS 4.2 test, the test image is small and quick to start, so when adding more hosts the gain eventually drops to zero, since the Java MGR process and MySQL on the manager head hit their CPU limits (for this test, 4 cores and 8 GB RAM on recent Intel Xeon CPUs).

I upped the MGR Java heap to 4 GB during this test; the default 2 GB is too little. When hitting the setup with 160 concurrent API operations, the resident set size of the Java MGR process goes to 2.6 GB. The stopVM operation was run at maximum concurrency; other operations were run at 40 active concurrent operations (via throttling). The standard ACS 4.2 API binding was used under Python 2.7.5 (Python 2.6 is not thread-safe and does not work).
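For reference, the heap bump amounts to changing the management server's JVM options along these lines (a hedged example; the exact file and surrounding flags depend on packaging, e.g. a tomcat6.conf under /etc/cloudstack/management on RPM-based installs):

```shell
# Raise the management server heap from the 2 GB default to 4 GB.
# File location and the other JAVA_OPTS flags vary by install.
JAVA_OPTS="-Xmx4g"
```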

Having enough RAM in the MGR for the test load is important. If there is not enough RAM, odd things happen during test runs (failed API calls).

No new features were tested; just plain basic churn of the VM deploy/start/stop/destroy cycle.

/Ove

* OEL64 environment using updates done roughly a week ago

On 09/04/2013 06:43 AM, Animesh Chaturvedi wrote:


I've created a 4.2.0 release, with the following artifacts up for a
vote:

Git Branch and Commit SH:
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.2
Commit: e39a7d8e0d3f2fd3e326b1bdf4aaf9ba5d900b02

List of changes:
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=blob_plain;f=CHANGES;hb=4.2

Source release (checksums and signatures are available at the same
location):
https://dist.apache.org/repos/dist/dev/cloudstack/4.2.0/

PGP release keys (signed using 94BE0D7C):
https://dist.apache.org/repos/dist/release/cloudstack/KEYS

Testing instructions are here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Release+test+procedure

Vote will be open for 72 hours (Friday, 9/6 10:00 PM PST).

For sanity in tallying the vote, can PMC members please be sure to indicate 
"(binding)" with their vote?

[ ] +1  approve
[ ] +0  no opinion
[ ] -1  disapprove (and reason why)


Thanks
Animesh

