Sounds like a question for net-dev more than core-libs - cc'd.
David
On 20/02/2014 4:11 PM, michael cui wrote:
On 02/18/2014 12:51 AM, michael cui wrote:
Hi,
I would like to discuss my current root-cause analysis of JDK-7052625:
com/sun/net/httpserver/bugs/6725892/Test.java fails intermittently.
As JDK-6725892 <https://bugs.openjdk.java.net/browse/JDK-6725892>
states, the purpose of this regression test is to verify that bad HTTP
connections are handled correctly, including connections that:
+ send no request
+ send an incomplete request
+ fail to read the response completely
The test3() method starts 20 threads for each of the cases listed above,
all at the same time, so 60 threads in total are started in test3(). Each
thread opens a connection to the HTTP server and simulates a normal or bad
HTTP request to see whether the server handles it correctly (20 threads
for incomplete read, 20 threads for incomplete write, and 20 threads for
the normal read/write case). Of these 60 threads, 40 use sleep to simulate
bad requests, as sketched below.
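For illustration only, here is a minimal sketch (not the actual Test.java
code; the address, request path and timing are assumptions) of what one
such "bad request" client thread does: it opens a connection, writes only
part of a request, and then stalls.

import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

class StalledRequestClient implements Runnable {
    private final InetSocketAddress serverAddr;  // assumed: where the test server listens

    StalledRequestClient(InetSocketAddress serverAddr) {
        this.serverAddr = serverAddr;
    }

    @Override
    public void run() {
        try (Socket s = new Socket(serverAddr.getAddress(), serverAddr.getPort())) {
            OutputStream out = s.getOutputStream();
            // Send only the request line, never the headers or terminating blank line.
            out.write("GET /test/ HTTP/1.1\r\n".getBytes("US-ASCII"));
            out.flush();
            Thread.sleep(5000);  // stall, keeping the connection half-open
        } catch (Exception e) {
            // In the real test, a failure here would be recorded and reported.
        }
    }
}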
The HTTP server is created by the following API call:
s1 = HttpServer.create (addr, 0);
According to the API doc
<http://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29>
and the ServerSocket.java source code, the second parameter is the socket
backlog, which is the maximum number of queued incoming connections to
allow on the listening socket. Queued TCP connections exceeding this limit
may be rejected by the TCP implementation. A default value of 50 is used
if the backlog is set to zero (see the API doc
<http://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29>
and ServerSocket.java).
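As a small self-contained illustration (the port and backlog value here
are chosen only for the example), the backlog is simply the second
argument to HttpServer.create; passing 0 lets the implementation fall back
to its default queue length:

import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

public class BacklogExample {
    public static void main(String[] args) throws IOException {
        // backlog = 100: explicitly allow up to ~100 queued, not-yet-accepted
        // connections; with backlog = 0 the default (50 in ServerSocket) applies.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 100);
        System.out.println("Bound to " + server.getAddress() + " with backlog 100");
        server.start();
        server.stop(0);
    }
}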
Since in test3() 40 of the 60 threads simulate bad HTTP requests by
sleeping while reading or writing, there is a small possibility that the
HTTP server's socket connection queue reaches its limit (50 by default)
and some TCP connections are reset in that situation. This could be the
root cause of the intermittent failure.
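To make the suspected failure mode concrete, here is a minimal standalone
sketch (not part of the test; the backlog value and connection count are
chosen only for illustration) of what happens when a listening socket's
accept queue fills up. The exact behavior is platform-dependent, which
would be consistent with the failures showing up only on some platforms:

import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class BacklogOverflowDemo {
    public static void main(String[] args) throws Exception {
        // Listening socket with a backlog of 1; accept() is never called,
        // so the accept queue fills up almost immediately.
        ServerSocket ss = new ServerSocket(0, 1);
        int port = ss.getLocalPort();

        List<Socket> clients = new ArrayList<>();  // kept open so queued connections stay pending
        for (int i = 0; i < 5; i++) {
            try {
                Socket s = new Socket();
                s.connect(new InetSocketAddress("127.0.0.1", port), 1000);
                clients.add(s);
                System.out.println("connection " + i + " established");
            } catch (Exception e) {
                // Depending on the OS, connections beyond the backlog may time out,
                // be refused, or be reset later when the client reads or writes.
                System.out.println("connection " + i + " failed: " + e);
            }
        }
        for (Socket s : clients) s.close();
        ss.close();
    }
}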
Test results for the original version:
0 failures on Linux in 10000 runs.
0 failures on Solaris in 10000 runs.
6 failures on Windows in 10000 runs.
28 failures on Mac in 10000 runs.
By increasing the number of bad-request threads, we can observe that the
failure frequency increases.
Test results for the fixed version, in which the HTTP server's backlog was
changed from 0 to 100:
0 failures on Linux in 10000 runs.
0 failures on Solaris in 10000 runs.
0 failures on Windows in 10000 runs.
0 failures on Mac in 10000 runs.
It seems to me that using the default of 0 for the HTTP server's backlog
could be the root cause of this intermittent failure.
Are we comfortable with this analysis? If it is the root cause, could
setting the backlog to 100 be a suggested fix?
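For reference, the suggested fix would only touch the backlog argument in
the test's server setup, along the lines of:
s1 = HttpServer.create (addr, 100);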
Thanks,
Michael Cui
Could anyone provide some insight on this analysis?
Michael Cui