Of course, we still have some really hard pages to fix, and many more that time out only a few times a day, but we've fixed enough pages to be manageably down to a 9 second cap.
This doesn't mean we can stop working on timeouts :) - but it does mean, at least for a while, that I won't be moving the hard timeout value down if doing so would add new timeout bugs.

It is time to consolidate and focus on the second half of the stretch goal Francis set: 9 second timeout + no critical bugs. (1/3 of the critical bugs are timeouts.)

The longer term goal is still a 5 second timeout with 1 second 99th percentile... and we had a discussion a few weeks back about setting the timeout for *new* pages to 5 seconds straight away. That's still not totally settled, but I think it's time we looked into how to make that work.

In the meantime the hard timeout default value can sit at 9 seconds. If we get to the point where it could be dropped another second without adding critical bugs, I'll definitely do that - but only if it won't be adding bugs :). (Dropping it provides a backstop against misbehaving pages; it's important overall to get it low.)

The following pages have timeout exceptions at the moment:

hard_timeout  default                                              0   9000
hard_timeout  pageid:BugTask:+create-question                      12  20000
hard_timeout  pageid:Distribution:+bugs                            4   10000
hard_timeout  pageid:Distribution:+bugtarget-portlet-tags-content  3   10000
hard_timeout  pageid:Distribution:EntryResource:searchTasks        5   10000
hard_timeout  pageid:Question:+index                               18  11000
hard_timeout  pageid:RootObject:+login                             1   20000

Question:+index because it takes a very long time before it does its commit - even without mail spooling it's a slow page that doesn't improve with retries. Ditto BugTask:+create-question.

Distribution:+bugs because we have some difficult performance work to do on search, and it's not inside the time frames suitable for maintenance squads - at least, as assessed so far.

The tags portlet should be temporary until we deploy the bugsummary table.

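To make the exceptions above concrete, here is a minimal Python sketch of how such rules could be resolved for a page. It assumes each rule reads as flag, scope, priority, value (in milliseconds), that the matching rule with the highest priority wins, and that a pageid scope matches one page exactly. The names RULES, scope_matches and hard_timeout_ms are made up for illustration and are not Launchpad's actual code.

```python
# Hypothetical sketch: none of these names come from Launchpad's code base.
RULES = [
    # (flag, scope, priority, value in milliseconds)
    ("hard_timeout", "default", 0, 9000),
    ("hard_timeout", "pageid:BugTask:+create-question", 12, 20000),
    ("hard_timeout", "pageid:Distribution:+bugs", 4, 10000),
    ("hard_timeout", "pageid:Distribution:+bugtarget-portlet-tags-content", 3, 10000),
    ("hard_timeout", "pageid:Distribution:EntryResource:searchTasks", 5, 10000),
    ("hard_timeout", "pageid:Question:+index", 18, 11000),
    ("hard_timeout", "pageid:RootObject:+login", 1, 20000),
]


def scope_matches(scope, pageid):
    """Assumed matching: 'default' always applies, 'pageid:X' applies to page X only."""
    return scope == "default" or scope == "pageid:" + pageid


def hard_timeout_ms(pageid, rules=RULES):
    """Return the value of the highest-priority hard_timeout rule that matches."""
    matching = [
        (priority, value)
        for flag, scope, priority, value in rules
        if flag == "hard_timeout" and scope_matches(scope, pageid)
    ]
    # With RULES as given, the default rule always matches, so the list is never empty.
    return max(matching)[1]
```

Under these assumptions, hard_timeout_ms("Question:+index") gives 11000, while a page with no exception (say a hypothetical "Person:+index") falls back to the 9000 default.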
pageid:Distribution:EntryResource:searchTasks seems to be driven by a script - perhaps arsenal - but it's getting into offsets of *thousands* in the DB: we really need to address the batching logic.

Finally, RootObject:+login is exempted because we're running into SSO backend delays which we have little visibility into as a team - there is a bug open on canonical-identity-provider about performance, and I've volunteered our collective knowledge if the ISD team have any trouble analysing how or why the thing is slow - we've all learnt a lot about addressing performance in the last 10 months, so please feel free to share :)

-Rob

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : launchpad-dev@lists.launchpad.net
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp