On 02/20/2014 07:24 PM, Chris Hegarty wrote:
Michael,
I’m ok with your analysis, and suggested fix.
From the original test output, in the bug description, I can see that there are 58 println’s
with "Request from:" for test3, and two "Worker: Error writing to server". This
would tend to support your analysis that the server, in some cases, is not accepting the
barrage of requests.
Please provide a webrev/changeset and I will sponsor the change for you.
Thank you very much for the review and for sponsoring the change!
Webrev link:
http://cr.openjdk.java.net/~tyan/michael/JDK-7052625/webrev.00/
-Chris.
On 20 Feb 2014, at 08:25, michael cui <michael....@oracle.com> wrote:
Hi,
I would like to discuss my current root cause analysis of JDK-7052625 :
com/sun/net/httpserver/bugs/6725892/Test.java fails intermittently
As JDK-6725892 <https://bugs.openjdk.java.net/browse/JDK-6725892> states, the
purpose of this regression test is to check that bad HTTP connections are handled
correctly, including clients that:
+ send no request
+ send an incomplete request
+ fail to read the response completely.
The test3() method starts 20 threads for each type of client at the same time,
so 60 threads are started in total. Each thread opens a connection to the
HTTP server and simulates either a normal or a bad HTTP request to see whether
the server handles it correctly (20 threads for incomplete read, 20 threads for
incomplete write, 20 threads for the normal read/write case).
All of these threads are started at the same time, and 40 of them use sleep
to simulate a bad request.
The HTTP server is created by the following API call:
s1 = HttpServer.create (addr, 0);
According to the API doc
<http://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29>
and the ServerSocket.java source code, the second parameter is the socket backlog, which is the
maximum number of queued incoming connections to allow on the listening socket. Queued TCP
connections exceeding this limit may be rejected by the TCP implementation. The default
value of 50 is used if the backlog is set to zero.
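For illustration only, here is a minimal sketch of passing the backlog explicitly instead of 0. The class name BacklogSketch and its helper methods are hypothetical and are not part of Test.java; binding to the loopback address on an ephemeral port is an assumption made so that the sketch runs on any machine.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class BacklogSketch {
    // Hypothetical helper: bind to an ephemeral loopback port with the given backlog.
    // A backlog of 0 asks for the default; 100 is the value proposed in this thread.
    static HttpServer createServer(int backlog) throws IOException {
        InetSocketAddress addr =
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 0);
        return HttpServer.create(addr, backlog);
    }

    // Starts a server, reports the port it was bound to, and stops it again.
    static int startAndStop(int backlog) throws IOException {
        HttpServer server = createServer(backlog);
        server.start();
        try {
            return server.getAddress().getPort();
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("default backlog, bound to port " + startAndStop(0));
        System.out.println("backlog 100, bound to port " + startAndStop(100));
    }
}
```

The backlog is only a hint to the operating system, so the effective queue length may differ from the requested value on a given platform.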
Since in test3() 40 of the 60 threads simulate a bad HTTP
request by sleeping while either reading or writing, there is a small
possibility that the HTTP server's socket connection queue reaches its limit (50 by
default) and that some TCP connections are reset in that situation.
This could be the root cause of the intermittent failure.
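The scenario can be sketched roughly as follows. This is not the code of Test.java; the class BurstSketch, the method burst, and the stall duration are all hypothetical, and the clients here only stall in the middle of writing a request. The sketch releases all client threads at once with a latch and counts how many connections succeeded or failed with an IOException.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class BurstSketch {
    // Hypothetical helper: returns {succeeded, failed} for a burst of stalled clients.
    static int[] burst(int backlog, int clients, long stallMillis) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        HttpServer server = HttpServer.create(new InetSocketAddress(loopback, 0), backlog);
        server.createContext("/", exchange -> {
            exchange.sendResponseHeaders(200, -1); // empty 200 response
            exchange.close();
        });
        server.start();
        int port = server.getAddress().getPort();

        CountDownLatch go = new CountDownLatch(1);
        AtomicInteger ok = new AtomicInteger();
        AtomicInteger failed = new AtomicInteger();
        Thread[] workers = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            workers[i] = new Thread(() -> {
                try {
                    go.await(); // all clients connect at (nearly) the same time
                    try (Socket s = new Socket(loopback, port)) {
                        OutputStream out = s.getOutputStream();
                        // Send only the request line, then stall: an incomplete request.
                        out.write("GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII));
                        out.flush();
                        Thread.sleep(stallMillis);
                    }
                    ok.incrementAndGet();
                } catch (IOException | InterruptedException e) {
                    failed.incrementAndGet(); // e.g. connection refused or reset
                }
            });
            workers[i].start();
        }
        go.countDown();
        for (Thread t : workers) {
            t.join();
        }
        server.stop(0);
        return new int[] { ok.get(), failed.get() };
    }

    public static void main(String[] args) throws Exception {
        int[] r = burst(0, 60, 200);
        System.out.println("succeeded=" + r[0] + " failed=" + r[1]);
    }
}
```

Whether any client fails depends on the platform's TCP implementation and on timing, so a single run of this sketch will usually show no failures at all.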
Test results of the original version:
0 failures on Linux for 10000 runs.
0 failures on Solaris for 10000 runs.
6 failures on Windows for 10000 runs.
28 failures on Mac for 10000 runs.
By increasing the number of bad-request threads, we can observe that the
failure frequency increases.
Test results of the fixed version, in which the backlog of the HTTP server was
changed from 0 to 100:
0 failures on Linux for 10000 runs.
0 failures on Solaris for 10000 runs.
0 failures on Windows for 10000 runs.
0 failures on Mac for 10000 runs.
It seems to me that using the default 0 for the backlog of the HTTP server could be
the root cause of this intermittent failure.
Are we comfortable with this analysis? If it is the root cause, could setting the
backlog to 100 be a suggested fix?
Thanks,
Michael Cui