+1
Testing was done in an OEL64*/KVM environment using a small-footprint
OEL64 VM image with userdata and reporting to a central server to verify
that VMs come up properly. The KVM hosts are 32 cores / 256 GByte RAM /
1 GBit / local storage on fast RAID (8x600G; RAID5). The manager runs in
VirtualBox with local MySQL.
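The "reporting to a central server" part is just a tiny phone-home baked
into the image. In spirit it is something like the following (the
addresses are hypothetical placeholders; CloudStack serves userdata via
the DHCP/virtual-router address):

    import urllib2

    VR = 'http://10.0.0.1'                    # hypothetical router/DHCP address
    LOG = 'http://logserver.example/report'   # hypothetical central log server

    # fetch the instance userdata and POST it to the log server, so the
    # harness can tick this VM off as having come up properly
    userdata = urllib2.urlopen(VR + '/latest/user-data').read()
    urllib2.urlopen(LOG, data=userdata)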
- Upped the per-hypervisor VM limit to 100 via the API and ran a
single-hypervisor deployment of 80 VMs through 5
deploy/start/stop/destroy cycles. 2.2 seconds per startVM API call on
average; 100% hit ratio (all VMs reported back). The test loop is
sketched after this list.
- Tested the concurrency of the startVM operation, which was more or
less nonexistent in 4.1.x: 8 KVM hosts, deploying 180 VMs in a
deploy/start/stop/destroy cycle. Depending on how many concurrent API
calls are made, the time per startVM call averages 0.8 to 1.5 seconds
during the start phase. During this test roughly one VM in 1000 fails to
report back to the logserver (they are reported as started in the ACS
MGR). This can be due to reasons unrelated to the cloud; I have not dug
into it.
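For reference, the test loop boils down to something like the sketch
below (Python 2.7; a minimal sketch, not the actual harness, and the
endpoint, keys and ids are placeholders). It uses the standard
CloudStack query signing (sorted, lowercased query string, HMAC-SHA1
with the secret key) and polls async jobs to completion:

    import base64, hashlib, hmac, json, threading, time, urllib, urllib2

    ENDPOINT = 'http://manager:8080/client/api'        # placeholder address
    APIKEY, SECRET = 'my-api-key', 'my-secret-key'     # placeholder credentials
    ZONE, TEMPLATE, OFFERING = 'z-id', 't-id', 'o-id'  # placeholder uuids

    def api(command, **params):
        # sign the sorted, lowercased query string with HMAC-SHA1
        params.update(command=command, apiKey=APIKEY, response='json')
        query = '&'.join('%s=%s' % (k, urllib.quote_plus(str(params[k])))
                         for k in sorted(params))
        sig = urllib.quote_plus(base64.b64encode(
            hmac.new(SECRET, query.lower(), hashlib.sha1).digest()))
        return json.load(urllib2.urlopen(
            '%s?%s&signature=%s' % (ENDPOINT, query, sig)))

    def wait(jobid):
        # poll the async job until it leaves the pending state (jobstatus 0)
        while True:
            r = api('queryAsyncJobResult',
                    jobid=jobid)['queryasyncjobresultresponse']
            if r['jobstatus'] != 0:
                return r
            time.sleep(1)

    def cycle():
        # one deploy/start/stop/destroy churn of a single VM
        vm = api('deployVirtualMachine', zoneid=ZONE, templateid=TEMPLATE,
                 serviceofferingid=OFFERING,
                 startvm='false')['deployvirtualmachineresponse']
        wait(vm['jobid'])
        for cmd in ('startVirtualMachine', 'stopVirtualMachine',
                    'destroyVirtualMachine'):
            wait(api(cmd, id=vm['id'])[cmd.lower() + 'response']['jobid'])

    # the concurrency test simply runs many cycles from parallel threads
    threads = [threading.Thread(target=cycle) for _ in range(40)]
    for t in threads: t.start()
    for t in threads: t.join()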
Until recently ACS 4.2 could not pass this test, due to what seems to be
a leakage of ROOT images: after running the test for a while, a large
number of ROOT images were left in the Expunging state, impossible to
remove, and they were not cleaned up even after 24 hours. I have not
seen this in the latest 4.2. The issue was non-deterministic and did not
leave any log traces that obviously gave away the cause (I have not
enabled debug logging, since that generates too much data during this
type of testing).
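Leaked volumes of this kind are easy to spot from the API. A quick check
along these lines (reusing the api() helper from the sketch above;
listall needs admin credentials) counts ROOT volumes stuck in Expunging:

    # list all ROOT volumes and count the ones stuck in Expunging
    vols = api('listVolumes', type='ROOT',
               listall='true')['listvolumesresponse'].get('volume', [])
    print '%d ROOT volumes in Expunging' % \
        len([v for v in vols if v['state'] == 'Expunging'])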
ACS 4.2 has significantly better concurrency than 4.1.1 and earlier,
roughly 10 times faster. The setup used in the testing is basic
networking with security groups. For 4.1.1 the average startVM call time
is in the 11+ second ballpark, never lower, and that figure does not
change when adding more hosts. For this ACS 4.2 test the image is small
and quick to start, so when adding more hosts the gain eventually drops
to zero, since the Java MGR process and MySQL on the manager head hit
their CPU limits (4 cores and 8 GByte RAM on recent Intel Xeon CPUs were
used for this test).
Upped the MGR Java memory to 4G during this test; the default 2G is too
little. When hitting the setup with 160 concurrent API operations, the
resident set size of the Java MGR process goes to 2.6 GByte. The stopVM
operation was run at maximum concurrency; other operations were run at
40 concurrent operations (via throttling, sketched below). The standard
ACS 4.2 API binding was used under Python 2.7.5 (Python 2.6 is not
thread safe and does not work).
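The throttling is nothing more exotic than a counting semaphore around
the API calls; a minimal sketch, reusing the api() helper from the
earlier sketch:

    sem = threading.BoundedSemaphore(40)   # at most 40 calls in flight

    def throttled(command, **params):
        with sem:                          # block while 40 calls are active
            return api(command, **params)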
Having enough RAM in the MGR for the test load is important; if RAM is
not enough, odd things happen during test runs (failed API calls).
No new features were tested, just plain basic churn of the VM
deploy/start/stop/destroy cycle.
/Ove
* OEL64 environment with updates applied roughly a week ago
On 09/04/2013 06:43 AM, Animesh Chaturvedi wrote:
I've created a 4.2.0 release, with the following artifacts up for a
vote:
Git Branch and Commit SHA:
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=shortlog;h=refs/heads/4.2
Commit: e39a7d8e0d3f2fd3e326b1bdf4aaf9ba5d900b02
List of changes:
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=blob_plain;f=CHANGES;hb=4.2
Source release (checksums and signatures are available at the same
location):
https://dist.apache.org/repos/dist/dev/cloudstack/4.2.0/
PGP release keys (signed using 94BE0D7C):
https://dist.apache.org/repos/dist/release/cloudstack/KEYS
Testing instructions are here:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Release+test+procedure
Vote will be open for 72 hours (Friday, 9/6 10:00 PM PST).
For sanity in tallying the vote, can PMC members please be sure to indicate
"(binding)" with their vote?
[ ] +1 approve
[ ] +0 no opinion
[ ] -1 disapprove (and reason why)
Thanks
Animesh