On 13/01/2010, sebb <seb...@gmail.com> wrote: > On 13/01/2010, sebb <seb...@gmail.com> wrote: > > Here is the output from > > > http://vmbuild.apache.org/continuum/buildResult.action?buildId=268733&projectId=22 > > > > I've rearranged it so the threads are in ConnectionStart order. > > I moved ConnectionStart to the beginning, added the Thread number at > > the end (dropping irrelevant stuff), and deleted the prefix > > "ConnectStart: 355958" > > > > 7635. StartupDelay: 13. ConnectTime: 2. Runtime: 218. 1 > > 7637. StartupDelay: 0. ConnectTime: 280. Runtime: 481. 2 > > 7724. StartupDelay: 0. ConnectTime: 193. Runtime: 394. 4 > > 7737. StartupDelay: 13. ConnectTime: 180. Runtime: 381. 5 > > 7750. StartupDelay: 112. ConnectTime: 167. Runtime: 368. 3 > > 7775. StartupDelay: 38. ConnectTime: 142. Runtime: 343. 6 > > 7785. StartupDelay: 10. ConnectTime: 132. Runtime: 333. 7 > > 7816. StartupDelay: 31. ConnectTime: 101. Runtime: 302. 8 > > 7853. StartupDelay: 38. ConnectTime: 1. Runtime: 202. 9 > > 7865. StartupDelay: 11. ConnectTime: 52. Runtime: 252. 10 > > 7874. StartupDelay: 9. ConnectTime: 43. Runtime: 243. 11 > > 7918. StartupDelay: 44. ConnectTime: -. Runtime: 109. 12 > > 7927. StartupDelay: 9. ConnectTime: -. Runtime: 113. 13 > > 7941. StartupDelay: 14. ConnectTime: -. Runtime: 104. 14 > > 7944. StartupDelay: 3. ConnectTime: -. Runtime: 101. 15 > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 100. 17 > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 100. 18 > > 7953. StartupDelay: 0. ConnectTime: -. Runtime: 101. 19 > > 7954. StartupDelay: 1. ConnectTime: -. Runtime: 100. 20 > > 7954. StartupDelay: 10. ConnectTime: -. Runtime: 100. 16 > > > > It can be seen that thread 2 reaches the getConnection() method some > > while before the next thread (7724-7637=87ms) yet takes longer than > > any of the other threads to return from the method. > > > > This could mean that there is a window somewhere in the code that > > causes this thread to pause whilst others progress. There does seem to > > be a problem here. > > > > I need to do a bit more analysis on the numbers (and perhaps update > > the debug) to see how the connected threads overlap. > > > > > A bit more analysis (see below) shows that the first thread completes > (releases its connnection) before some of the other threads call > getConnection(). All the other threads release their connections after > the last thread invokes getConnection(), so only the first waiting > thread gets a connection. > > One way to fix this might be to change the holdTime to 3*waitTime > instead of twice. Or more. > > However there is still a puzzle - why does the second thread take so > long to get the connection? And why do the threads that take longer > than waitTime to get a connection not fail with a timeout? > > PreConnect > ! EndRun = PreConnect+RunTime > ! ! StartDelay (elapsed) > ! ! ! getConn() (elapsed) > ! ! ! ! RunTime (elapsed) > ! ! ! ! ! IDX > 7635 7853 13 2 218 1 > 7637 8118 0 280 481 2 > 7724 8118 0 193 394 4 > 7737 8118 13 180 381 5 > 7750 8118 112 167 368 3 > 7775 8118 38 142 343 6 > 7785 8118 10 132 333 7 > 7816 8118 31 101 302 8 > 7853 8055 38 1 202 9 > 7865 8117 11 52 252 10 > 7874 8117 9 43 243 11 > 7918 8027 44 - 109 12 > 7927 8040 9 - 113 13 > 7941 8045 14 - 104 14 > 7944 8045 3 - 101 15 > 7953 8053 0 - 100 17 > 7953 8053 0 - 100 18 > 7953 8054 0 - 101 19 > 7954 8054 1 - 100 20 > 7954 8054 10 - 100 16 > > The PreConnect and Endtime values are in ms from an arbitrary point in time > The elapsed times are in ms.
Rats! The problem is clearly not solved, see failures from http://vmbuild.apache.org/continuum/buildResult.action?buildId=269039&projectId=22 In both cases, the earliest connection release (Post) is after the latest getConnection (Pre), by a margin of about 80ms. These look like a different failure mode from the previous run, where one connection was released too early. Unless there is some additional processing that goes on after the connection has been made available but before the conn.close() can return? In which case, maybe there would be time for the connection to be snapped up. I propose increasing the hold time to 3* wait and see what happens. Pre: 8. Post: 213. Startup: 8. getConn(): 1. Runtime: 205. IDX: 0. Pre: 39. Post: 352. Startup: 29. getConn(): 108. Runtime: 313. IDX: 1. Pre: 24. Post: 351. Startup: 1. getConn(): 123. Runtime: 327. IDX: 2. Pre: 24. Post: 352. Startup: 1. getConn(): 124. Runtime: 328. IDX: 3. Pre: 24. Post: 352. Startup: 0. getConn(): 124. Runtime: 328. IDX: 4. Pre: 24. Post: 352. Startup: 0. getConn(): 124. Runtime: 328. IDX: 5. Pre: 43. Post: 351. Startup: 3. getConn(): 104. Runtime: 308. IDX: 6. Pre: 86. Post: 351. Startup: 1. getConn(): 61. Runtime: 265. IDX: 7. Pre: 114. Post: 351. Startup: 28. getConn(): 33. Runtime: 237. IDX: 8. Pre: 116. Post: 351. Startup: 2. getConn(): 31. Runtime: 235. IDX: 9. Pre: 116. Post: 417. Startup: 0. getConn(): 99. Runtime: 301. IDX: 10. Pre: 148. Post: -. Startup: 32. getConn(): -. Runtime: 100. IDX: 11. Pre: 148. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 12. Pre: 149. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 13. Pre: 149. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 14. Pre: 150. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 15. Pre: 150. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 16. Pre: 150. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 17. Pre: 152. Post: -. Startup: 2. getConn(): -. Runtime: 100. IDX: 18. Pre: 152. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 19. Pre: 0. Post: 201. Startup: 0. getConn(): 1. Runtime: 201. IDX: 0. Pre: 1. Post: 319. Startup: 0. getConn(): 117. Runtime: 318. IDX: 1. Pre: 43. Post: 320. Startup: 42. getConn(): 76. Runtime: 277. IDX: 2. Pre: 43. Post: 319. Startup: 0. getConn(): 76. Runtime: 276. IDX: 3. Pre: 68. Post: 320. Startup: 25. getConn(): 51. Runtime: 252. IDX: 4. Pre: 68. Post: 320. Startup: 0. getConn(): 51. Runtime: 252. IDX: 5. Pre: 93. Post: 319. Startup: 25. getConn(): 26. Runtime: 226. IDX: 6. Pre: 94. Post: 319. Startup: 1. getConn(): 25. Runtime: 225. IDX: 7. Pre: 118. Post: 319. Startup: 24. getConn(): 1. Runtime: 201. IDX: 8. Pre: 120. Post: 320. Startup: 2. getConn(): 0. Runtime: 200. IDX: 9. Pre: 120. Post: 402. Startup: 0. getConn(): 81. Runtime: 282. IDX: 10. Pre: 120. Post: -. Startup: 0. getConn(): -. Runtime: 101. IDX: 11. Pre: 121. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 12. Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 13. Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 14. Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 15. Pre: 121. Post: -. Startup: 0. getConn(): -. Runtime: 101. IDX: 16. Pre: 122. Post: -. Startup: 1. getConn(): -. Runtime: 100. IDX: 17. Pre: 122. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 18. Pre: 122. Post: -. Startup: 0. getConn(): -. Runtime: 100. IDX: 19. > > S/// > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org