On 13/01/2010, sebb <seb...@gmail.com> wrote: > On 13/01/2010, sebb <seb...@gmail.com> wrote: > > On 13/01/2010, sebb <seb...@gmail.com> wrote: > > > Here is the output from > > > > http://vmbuild.apache.org/continuum/buildResult.action?buildId=268733&projectId=22 > > > > > > I've rearranged it so the threads are in ConnectionStart order. > > > I moved ConnectionStart to the beginning, added the Thread number at > > > the end (dropping irrelevant stuff), and deleted the prefix > > > "ConnectStart: 355958" > > > > > > 7635. StartupDelay: 13. ConnectTime: 2. Runtime: 218. 1 > > > 7637. StartupDelay: 0. ConnectTime: 280. Runtime: 481. 2 > > > 7724. StartupDelay: 0. ConnectTime: 193. Runtime: 394. 4 > > > 7737. StartupDelay: 13. ConnectTime: 180. Runtime: 381. 5 > > > 7750. StartupDelay: 112. ConnectTime: 167. Runtime: 368. 3 > > > 7775. StartupDelay: 38. ConnectTime: 142. Runtime: 343. 6 > > > 7785. StartupDelay: 10. ConnectTime: 132. Runtime: 333. 7 > > > 7816. StartupDelay: 31. ConnectTime: 101. Runtime: 302. 8 > > > 7853. StartupDelay: 38. ConnectTime: 1. Runtime: 202. 9 > > > 7865. StartupDelay: 11. ConnectTime: 52. Runtime: 252. 10 > > > 7874. StartupDelay: 9. ConnectTime: 43. Runtime: 243. 11 > > > 7918. StartupDelay: 44. ConnectTime: -. Runtime: 109. 12 > > > 7927. StartupDelay: 9. ConnectTime: -. Runtime: 113. 13 > > > 7941. StartupDelay: 14. ConnectTime: -. Runtime: 104. 14 > > > 7944. StartupDelay: 3. ConnectTime: -. Runtime: 101. 15 > > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 100. 17 > > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 100. 18 > > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 101. 19 > > > 7954. StartupDelay: 1. ConnectTime: -. Runtime: 100. 20 > > > 7954. StartupDelay: 10. ConnectTime: -. Runtime: 100. 16 > > > > > > It can be seen that thread 2 reaches the getConnection() method some > > > while before the next thread (7724-7637=87ms) yet takes longer than > > > any of the other threads to return from the method. > > > > > > This could mean that there is a window somewhere in the code that > > > causes this thread to pause whilst others progress. There does seem to > > > be a problem here. > > > > > > I need to do a bit more analysis on the numbers (and perhaps update > > > the debug) to see how the connected threads overlap. > > > > > > > > > A bit more analysis (see below) shows that the first thread completes > > (releases its connnection) before some of the other threads call > > getConnection(). All the other threads release their connections after > > the last thread invokes getConnection(), so only the first waiting > > thread gets a connection. > > > > One way to fix this might be to change the holdTime to 3*waitTime > > instead of twice. Or more. > > > > However there is still a puzzle - why does the second thread take so > > long to get the connection? And why do the threads that take longer > > than waitTime to get a connection not fail with a timeout? > > > > PreConnect > > ! EndRun = PreConnect+RunTime > > ! ! StartDelay (elapsed) > > ! ! ! getConn() (elapsed) > > ! ! ! ! RunTime (elapsed) > > ! ! ! ! ! IDX > > 7635 7853 13 2 218 1 > > 7637 8118 0 280 481 2 > > 7724 8118 0 193 394 4 > > 7737 8118 13 180 381 5 > > 7750 8118 112 167 368 3 > > 7775 8118 38 142 343 6 > > 7785 8118 10 132 333 7 > > 7816 8118 31 101 302 8 > > 7853 8055 38 1 202 9 > > 7865 8117 11 52 252 10 > > 7874 8117 9 43 243 11 > > 7918 8027 44 - 109 12 > > 7927 8040 9 - 113 13 > > 7941 8045 14 - 104 14 > > 7944 8045 3 - 101 15 > > 7953 8053 0 - 100 17 > > 7953 8053 0 - 100 18 > > 7953 8054 0 - 101 19 > > 7954 8054 1 - 100 20 > > 7954 8054 10 - 100 16 > > > > The PreConnect and Endtime values are in ms from an arbitrary point in > time > > The elapsed times are in ms. > > > Rats! The problem is clearly not solved, see failures from > > > http://vmbuild.apache.org/continuum/buildResult.action?buildId=269039&projectId=22 > > In both cases, the earliest connection release (Post) is after the > latest getConnection (Pre), by a margin of about 80ms. > > These look like a different failure mode from the previous run, where > one connection was released too early. > > Unless there is some additional processing that goes on after the > connection has been made available but before the conn.close() can > return? In which case, maybe there would be time for the connection to > be snapped up. > > I propose increasing the hold time to 3* wait and see what happens.
I did this, and it has helped - usually now only one of the testMultiThreaded2() tests fails. However, it's not necessarily the one with holdTime = 3 * waitTime. It's looking less and less likely that the problem is due to one thread releasing the connection early enough to allow another thread to grab it. > Pre: 8. Post: 213. Startup: 8. getConn(): 1. Runtime: 205. IDX: 0. > Pre: 39. Post: 352. Startup: 29. getConn(): 108. Runtime: 313. IDX: 1. > Pre: 24. Post: 351. Startup: 1. getConn(): 123. Runtime: 327. IDX: 2. > Pre: 24. Post: 352. Startup: 1. getConn(): 124. Runtime: 328. IDX: 3. > Pre: 24. Post: 352. Startup: 0. getConn(): 124. Runtime: 328. IDX: 4. > Pre: 24. Post: 352. Startup: 0. getConn(): 124. Runtime: 328. IDX: 5. > Pre: 43. Post: 351. Startup: 3. getConn(): 104. Runtime: 308. IDX: 6. > Pre: 86. Post: 351. Startup: 1. getConn(): 61. Runtime: 265. IDX: 7. > Pre: 114. Post: 351. Startup: 28. getConn(): 33. Runtime: 237. IDX: 8. > Pre: 116. Post: 351. Startup: 2. getConn(): 31. Runtime: 235. IDX: 9. > Pre: 116. Post: 417. Startup: 0. getConn(): 99. Runtime: 301. IDX: 10. > Pre: 148. Post: -. Startup: 32. getConn(): -. Runtime: 100. IDX: 11. > Pre: 148. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 12. > Pre: 149. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 13. > Pre: 149. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 14. > Pre: 150. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 15. > Pre: 150. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 16. > Pre: 150. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 17. > Pre: 152. Post: -. Startup: 2. getConn(): -. Runtime: 100. IDX: 18. > Pre: 152. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 19. > > Pre: 0. Post: 201. Startup: 0. getConn(): 1. Runtime: 201. IDX: 0. > Pre: 1. Post: 319. Startup: 0. getConn(): 117. Runtime: 318. IDX: 1. > Pre: 43. Post: 320. Startup: 42. getConn(): 76. Runtime: 277. IDX: 2. > Pre: 43. Post: 319. Startup: 0. getConn(): 76. Runtime: 276. IDX: 3. > Pre: 68. Post: 320. Startup: 25. getConn(): 51. Runtime: 252. IDX: 4. > Pre: 68. Post: 320. Startup: 0. getConn(): 51. Runtime: 252. IDX: 5. > Pre: 93. Post: 319. Startup: 25. getConn(): 26. Runtime: 226. IDX: 6. > Pre: 94. Post: 319. Startup: 1. getConn(): 25. Runtime: 225. IDX: 7. > Pre: 118. Post: 319. Startup: 24. getConn(): 1. Runtime: 201. IDX: 8. > Pre: 120. Post: 320. Startup: 2. getConn(): 0. Runtime: 200. IDX: 9. > Pre: 120. Post: 402. Startup: 0. getConn(): 81. Runtime: 282. IDX: 10. > Pre: 120. Post: -. Startup: 0. getConn(): -. Runtime: 101. IDX: 11. > Pre: 121. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 12. > Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 13. > Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 14. > Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 15. > Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 101. IDX: 16. > Pre: 122. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 17. > Pre: 122. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 18. > Pre: 122. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 19. > > > > > > S/// > > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org