Michael, I’m OK with your analysis and the suggested fix.

From the original test output in the bug description, I can see that there are 58 println’s with "Request from:" for test3, and two "Worker: Error writing to server". This tends to support your analysis that the server, in some cases, is not accepting the barrage of requests.

Please provide a webrev/changeset and I will sponsor the change for you.

-Chris.

On 20 Feb 2014, at 08:25, michael cui <michael....@oracle.com> wrote:

> Hi,
>
> I would like to discuss my current root cause analysis of JDK-7052625:
> com/sun/net/httpserver/bugs/6725892/Test.java fails intermittently
>
> As JDK-6725892 <https://bugs.openjdk.java.net/browse/JDK-6725892> states, the
> purpose of this regression test is to check that bad HTTP connections are
> handled correctly, which includes:
> + send no request
> + send an incomplete request
> + fail to read the response completely.
>
> The test3() method starts 20 threads for each type listed above at the same
> time, so 60 threads in total are started in test3(). Each thread opens a
> connection to the httpserver and simulates a normal or bad HTTP request to see
> whether the HTTP server can handle it correctly. (20 threads for incomplete
> read, 20 threads for incomplete write, 20 threads for the normal read/write
> case.)
>
> Those threads are started at the same time. Among them, 40 threads use
> sleep to simulate a bad request.
>
> The HTTP server is created by the following API call:
> s1 = HttpServer.create (addr, 0);
>
> According to the API doc
> <http://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29>
> and the ServerSocket.java source code, the second parameter is the backlog of
> the socket, which is the maximum number of queued incoming connections to
> allow on the listening socket. Queued TCP connections exceeding this limit may
> be rejected by the TCP implementation.
> The default value 50 will be used if it is set
> to zero (see the API doc
> <http://docs.oracle.com/javase/7/docs/api/java/net/ServerSocket.html#ServerSocket%28int%29>
> and ServerSocket.java).
>
> Since in test3() 40 of the 60 threads simulate a bad HTTP request by
> sleeping while either reading or writing, there is a small possibility that
> the httpserver's socket connection queue reaches its limit (50 by default)
> and that some TCP connections are reset in that situation.
>
> This could be the root cause of the intermittent failure.
>
> Test results of the original version:
> 0 failures on Linux for 10000 runs
> 0 failures on Solaris for 10000 runs
> 6 failures on Windows for 10000 runs
> 28 failures on Mac for 10000 runs
>
> By increasing the number of bad-request threads, we can observe that the
> frequency of failure increases.
>
> Test results of the fixed version, in which the backlog of the httpserver was
> changed from 0 to 100:
> 0 failures on Linux for 10000 runs
> 0 failures on Solaris for 10000 runs
> 0 failures on Windows for 10000 runs
> 0 failures on Mac for 10000 runs
>
> It seems to me that using the default 0 for the backlog of the httpserver
> could be the root cause of this intermittent failure.
> Are we comfortable with this analysis? If it is the root cause, could setting
> the backlog to 100 be a suggested fix?
>
> Thanks,
> Michael Cui
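
For reference, the proposed change amounts to passing an explicit backlog as the second argument of HttpServer.create. The following is only a minimal sketch of that idea, not the actual test code: the class name BacklogSketch, the loopback address with an ephemeral port, and the trivial handler are assumptions made for illustration. Only the HttpServer.create(addr, backlog) call and the value 100 come from the discussion above.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical stand-alone sketch; the real regression test is structured differently.
public class BacklogSketch {
    // Backlog value proposed in the thread, larger than the 60 concurrent
    // client threads that test3() starts.
    static final int BACKLOG = 100;

    static HttpServer createServer(int backlog) throws IOException {
        // Assumption: bind to loopback on an ephemeral port (port 0).
        InetSocketAddress addr =
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0);
        // The second argument is the backlog of the listening socket.
        HttpServer server = HttpServer.create(addr, backlog);
        // Assumption: a trivial handler, only so the server is usable.
        server.createContext("/", exchange -> {
            byte[] body = "ok".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = createServer(BACKLOG);
        server.start();
        System.out.println("Listening on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```

Note that the backlog is only a hint to the underlying TCP implementation, so the effective queue length may still differ between platforms.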