On 13/01/2010, sebb <seb...@gmail.com> wrote:
> On 13/01/2010, sebb <seb...@gmail.com> wrote:
>  > Here is the output from
>  >  
> http://vmbuild.apache.org/continuum/buildResult.action?buildId=268733&projectId=22
>  >
>  >  I've rearranged it so the threads are in ConnectionStart order.
>  >  I moved ConnectionStart to the beginning, added the Thread number at
>  >  the end (dropping irrelevant stuff), and deleted the prefix
>  >  "ConnectStart: 355958"
>  >
>  >  7635. StartupDelay:  13. ConnectTime:   2. Runtime: 218.  1
>  >  7637. StartupDelay:   0. ConnectTime: 280. Runtime: 481.  2
>  >  7724. StartupDelay:   0. ConnectTime: 193. Runtime: 394.  4
>  >  7737. StartupDelay:  13. ConnectTime: 180. Runtime: 381.  5
>  >  7750. StartupDelay: 112. ConnectTime: 167. Runtime: 368.  3
>  >  7775. StartupDelay:  38. ConnectTime: 142. Runtime: 343.  6
>  >  7785. StartupDelay:  10. ConnectTime: 132. Runtime: 333.  7
>  >  7816. StartupDelay:  31. ConnectTime: 101. Runtime: 302.  8
>  >  7853. StartupDelay:  38. ConnectTime:   1. Runtime: 202.  9
>  >  7865. StartupDelay:  11. ConnectTime:  52. Runtime: 252. 10
>  >  7874. StartupDelay:   9. ConnectTime:  43. Runtime: 243. 11
>  >  7918. StartupDelay:  44. ConnectTime:   -. Runtime: 109. 12
>  >  7927. StartupDelay:   9. ConnectTime:   -. Runtime: 113. 13
>  >  7941. StartupDelay:  14. ConnectTime:   -. Runtime: 104. 14
>  >  7944. StartupDelay:   3. ConnectTime:   -. Runtime: 101. 15
>  >  7953. StartupDelay:   0. ConnectTime:   -. Runtime: 100. 17
>  >  7953. StartupDelay:   0. ConnectTime:   -. Runtime: 100. 18
>  >  7953. StartupDelay:   0. ConnectTime:   -. Runtime: 101. 19
>  >  7954. StartupDelay:   1. ConnectTime:   -. Runtime: 100. 20
>  >  7954. StartupDelay:  10. ConnectTime:   -. Runtime: 100. 16
>  >
>  >  It can be seen that thread 2 reaches the getConnection() method some
>  >  while before the next thread (7724-7637=87ms) yet takes longer than
>  >  any of the other threads to return from the method.
>  >
>  >  This could mean that there is a window somewhere in the code that
>  >  causes this thread to pause whilst others progress. There does seem to
>  >  be a problem here.
>  >
>  >  I need to do a bit more analysis on the numbers (and perhaps update
>  >  the debug) to see how the connected threads overlap.
>  >
>
>
> A bit more analysis (see below) shows that the first thread completes
>  (releases its connnection) before some of the other threads call
>  getConnection(). All the other threads release their connections after
>  the last thread invokes getConnection(), so only the first waiting
>  thread gets a connection.
>
>  One way to fix this might be to change the holdTime to 3*waitTime
>  instead of twice. Or more.
>
>  However there is still a puzzle - why does the second thread take so
>  long to get the connection? And why do the threads that take longer
>  than waitTime to get a connection not fail with a timeout?
>
>  PreConnect
>  !    EndRun = PreConnect+RunTime
>  !    !    StartDelay (elapsed)
>  !    !    !   getConn() (elapsed)
>  !    !    !   !   RunTime (elapsed)
>  !    !    !   !   !   IDX
>  7635 7853  13   2 218 1
>  7637 8118   0 280 481 2
>  7724 8118   0 193 394 4
>  7737 8118  13 180 381 5
>  7750 8118 112 167 368 3
>  7775 8118  38 142 343 6
>  7785 8118  10 132 333 7
>  7816 8118  31 101 302 8
>  7853 8055  38   1 202 9
>  7865 8117  11  52 252 10
>  7874 8117   9  43 243 11
>  7918 8027  44   - 109 12
>  7927 8040   9   - 113 13
>  7941 8045  14   - 104 14
>  7944 8045   3   - 101 15
>  7953 8053   0   - 100 17
>  7953 8053   0   - 100 18
>  7953 8054   0   - 101 19
>  7954 8054   1   - 100 20
>  7954 8054  10   - 100 16
>
>  The PreConnect and Endtime values are in ms from an arbitrary point in time
>  The elapsed times are in ms.

Rats! The problem is clearly not solved, see failures from

http://vmbuild.apache.org/continuum/buildResult.action?buildId=269039&projectId=22

In both cases, the earliest connection release (Post) is after the
latest getConnection (Pre), by a margin of about 80ms.

These look like a different failure mode from the previous run, where
one connection was released too early.

Unless there is some additional processing that goes on after the
connection has been made available but before the conn.close() can
return? In which case, maybe there would be time for the connection to
be snapped up.

I propose increasing the hold time to 3* wait and see what happens.

Pre:   8. Post: 213. Startup:  8. getConn():   1. Runtime: 205. IDX: 0.
Pre:  39. Post: 352. Startup: 29. getConn(): 108. Runtime: 313. IDX: 1.
Pre:  24. Post: 351. Startup:  1. getConn(): 123. Runtime: 327. IDX: 2.
Pre:  24. Post: 352. Startup:  1. getConn(): 124. Runtime: 328. IDX: 3.
Pre:  24. Post: 352. Startup:  0. getConn(): 124. Runtime: 328. IDX: 4.
Pre:  24. Post: 352. Startup:  0. getConn(): 124. Runtime: 328. IDX: 5.
Pre:  43. Post: 351. Startup:  3. getConn(): 104. Runtime: 308. IDX: 6.
Pre:  86. Post: 351. Startup:  1. getConn():  61. Runtime: 265. IDX: 7.
Pre: 114. Post: 351. Startup: 28. getConn():  33. Runtime: 237. IDX: 8.
Pre: 116. Post: 351. Startup:  2. getConn():  31. Runtime: 235. IDX: 9.
Pre: 116. Post: 417. Startup:  0. getConn():  99. Runtime: 301. IDX: 10.
Pre: 148. Post:   -. Startup: 32. getConn():   -. Runtime: 100. IDX: 11.
Pre: 148. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 12.
Pre: 149. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 13.
Pre: 149. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 14.
Pre: 150. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 15.
Pre: 150. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 16.
Pre: 150. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 17.
Pre: 152. Post:   -. Startup:  2. getConn():   -. Runtime: 100. IDX: 18.
Pre: 152. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 19.

Pre:   0. Post: 201. Startup:  0. getConn():   1. Runtime: 201. IDX: 0.
Pre:   1. Post: 319. Startup:  0. getConn(): 117. Runtime: 318. IDX: 1.
Pre:  43. Post: 320. Startup: 42. getConn():  76. Runtime: 277. IDX: 2.
Pre:  43. Post: 319. Startup:  0. getConn():  76. Runtime: 276. IDX: 3.
Pre:  68. Post: 320. Startup: 25. getConn():  51. Runtime: 252. IDX: 4.
Pre:  68. Post: 320. Startup:  0. getConn():  51. Runtime: 252. IDX: 5.
Pre:  93. Post: 319. Startup: 25. getConn():  26. Runtime: 226. IDX: 6.
Pre:  94. Post: 319. Startup:  1. getConn():  25. Runtime: 225. IDX: 7.
Pre: 118. Post: 319. Startup: 24. getConn():   1. Runtime: 201. IDX: 8.
Pre: 120. Post: 320. Startup:  2. getConn():   0. Runtime: 200. IDX: 9.
Pre: 120. Post: 402. Startup:  0. getConn():  81. Runtime: 282. IDX: 10.
Pre: 120. Post:   -. Startup:  0. getConn():   -. Runtime: 101. IDX: 11.
Pre: 121. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 12.
Pre: 121. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 13.
Pre: 121. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 14.
Pre: 121. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 15.
Pre: 121. Post:   -. Startup:  0. getConn():   -. Runtime: 101. IDX: 16.
Pre: 122. Post:   -. Startup:  1. getConn():   -. Runtime: 100. IDX: 17.
Pre: 122. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 18.
Pre: 122. Post:   -. Startup:  0. getConn():   -. Runtime: 100. IDX: 19.


>
>  S///
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to