On 14/10/2013 5:47 PM, Toni Mueller wrote:
> did you investigate disk I/O?
Hi again,

Thanks for your suggestions (more on those below). In the meantime, we have increased CPU power to 4 cores and the server now behaves much better.
I found that server performance was hitting a bottleneck in php-fpm because the microcache was effectively NOT being used: most pages were returning 303 and 502 codes, and those return codes are not included in fastcgi_cache_valid by default. When I set:
fastcgi_cache_valid 200 301 302 303 502 3s;

I saw immediate performance gains, and the Unix load average dropped to almost 0 (from 100 - not a typo) under load.
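For context, here is a minimal sketch of the kind of microcache setup I mean; the cache path, zone name, sizes and the php-fpm socket below are illustrative placeholders, not my exact configuration (that is at the link further down):

http {
    # Illustrative cache storage; path, zone name and sizes are placeholders.
    fastcgi_cache_path /var/cache/nginx/microcache levels=1:2
                       keys_zone=microcache:10m max_size=256m inactive=1m;

    server {
        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_pass unix:/var/run/php-fpm.sock;  # placeholder upstream

            fastcgi_cache microcache;
            fastcgi_cache_key $scheme$request_method$host$request_uri;
            # Also cache the codes that were previously bypassing the cache:
            fastcgi_cache_valid 200 301 302 303 502 3s;
        }
    }
}

With a 3-second validity, even a short burst of identical requests is served from the cache instead of hitting php-fpm.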
I used iostat during a load test and I didn't see any serious stress on I/O. The worst (max load) recorded entry is:
==========================================================================================================
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          85.43    0.00   12.96    0.38    0.00    1.23

Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
vda        0.00  136.50   0.00   21.20     0.00  1260.00     59.43      1.15   54.25   3.92   8.30
dm-0       0.00    0.00   0.00  157.50     0.00  1260.00      8.00     13.39   85.04   0.53   8.29
dm-1       0.00    0.00   0.00    0.00     0.00     0.00      0.00      0.00    0.00   0.00   0.00
==========================================================================================================

Can you see a serious problem here? (I am not an expert, but, judging from what I have read on the Internet, these numbers do not look bad.)
Now my problem is that performance seems to hit a ceiling at around 1200 req/sec (which is not too bad, anyway), although CPU and memory remain ample throughout the test. Increasing the load beyond that (I am using tsung for load testing) only results in a growing number of "error_connect_emfile" errors.
The results of one test are attached (100 users arriving per second for 5 minutes, with a maximum of 10000 users, each of them hitting the homepage 100 times; details of the test are at the bottom of this mail).
My research suggests this is a result of file descriptor exhaustion; however, I could not find the root cause. The following all seem OK:
# cat /proc/sys/fs/file-max
592940
# ulimit -n
200000
# ulimit -Hn
200000
# ulimit -Sn
200000
# grep nofile /etc/security/limits.conf
*    -    nofile    200000

Could you please guide me on how to resolve this issue? What is the real bottleneck here, and how can it be overcome?
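The limits above apply to my shell; I am not certain they are what the running nginx workers actually inherit. If the limit turns out to be on the nginx side, I assume raising it explicitly would look roughly like this (values are illustrative, not what I currently have set):

# Assumed sketch: set the per-worker descriptor limit explicitly,
# instead of relying on whatever limit the workers inherit at startup.
worker_rlimit_nofile  200000;

events {
    # Each connection uses at least one descriptor; keep this
    # comfortably below worker_rlimit_nofile.
    worker_connections  10240;
}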
My config remains as initially posted (it can also be seen here: https://www.ruby-forum.com/topic/4417776), with the only difference being "worker_processes 4" (since we now have 4 CPU cores).
Please advise.

============================= tsung.xml <start> =============================
<?xml version="1.0"?>
<!DOCTYPE tsung SYSTEM "/usr/share/tsung/tsung-1.0.dtd">
<tsung loglevel="debug" dumptraffic="false" version="1.0">
  <clients>
    <client host="localhost" use_controller_vm="true" maxusers="10000"/>
  </clients>
  <servers>
    <server host="www.example.com" port="80" type="tcp"></server>
  </servers>
  <load duration="5" unit="minute">
    <arrivalphase phase="1" duration="5" unit="minute">
      <users arrivalrate="100" unit="second"/>
    </arrivalphase>
  </load>
  <sessions>
    <session probability="100" name="hit_en_homepage" type="ts_http">
      <for from="1" to="100" var="i">
        <request><http url='/' version='1.1' method='GET'></http></request>
        <thinktime random='true' value='1'/>
      </for>
    </session>
  </sessions>
</tsung>
============================== tsung.xml <end> ===============================
Thanks and regards,
Nick
<<attachment: graphes-Perfs-rate_tn.png>>
<<attachment: graphes-Users_Arrival-rate_tn.png>>
<<attachment: graphes-Users-simultaneous_tn.png>>
<<attachment: graphes-Errors-rate_tn.png>>
<<attachment: graphes-Perfs-mean_tn.png>>