On Mon, Sep 20, 2010 at 8:43 PM, Patrick Hunt <ph...@cloudera.com> wrote:
> Hi. Improving resource use is a great goal, I'm not sure it's that
> clear-cut though. I'm only familiar with ZK: note that these two jobs
> are our patch queues, which only get run when a user submits a patch
> to a jira (only a few patches on each job over the last couple
> months):
> Zookeeper-Patch-h1.grid.sp2.yahoo.net | 2 mo 2 days
> Zookeeper-Patch-h7.grid.sp2.yahoo.net | 1 mo 16 days
> this may fail for any number of reasons (patch won't apply, no tests,
> FindBugs issues, etc...) Also notice that a patch gets sent to only 1
> of 3 possible machines in some pseudo-random fashion. So while one
> patch job shows a recent success, the others do not. So to some extent
> this is out of our hands.

I agree that this might be a problem. However, looking at these specific jobs, both seem like they would be off the list had they been fully maintained. E.g. the following build seems to have failed due to a build configuration problem:

https://hudson.apache.org/hudson/job/Zookeeper-Patch-h7.grid.sp2.yahoo.net/110/console

There are quite a few builds failing for the same reason. Had these been fixed, this job would not have been caught by the script.

That said, I'll be happy to maintain a whitelist of jobs in the disable script. So, if you have a job that has a good reason for failing for a long time, let me know.

> We also see frequent failures from things that seem like infrastructure
> issues, here's the console output from a couple of recent failures:
> WARNING: clock of the subversion server appears to be out of sync.
> This can result in inconsistent check out behavior.
>
> here's another:
> Checking out http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk
> ERROR: Failed to check out
> http://svn.apache.org/repos/asf/hadoop/zookeeper/trunk
> org.tmatesoft.svn.core.SVNException: svn: unknown host
> svn: OPTIONS request failed on '/repos/asf/hadoop/zookeeper/trunk'
>
> that said, we recently had issues with our trunk that were causing
> intermittent failures. We've been working on those and hopefully it
> will help to clear these patch issues.

Yes, there will always be builds failing for these reasons (e.g. Hudson instability, network problems). I would recommend deleting these builds, as they do not reflect problems in your build and don't add much knowledge (besides bad statistics). That's what I do for the jobs I maintain.

/niklas