Re: Builds that have been failing for a while
Be warned, I'll run the script to disable builds which have been failing for more than 31 days on Sunday. This is the current list of such jobs:

ActiveMQ-SysTest-5.3 | 6 mo 25 days
AsyncWeb | 3 mo 2 days
Cayenne-doc | 1 mo 21 days
clerezza-site | 3 mo 21 days
Empire-DB multios | 1 mo 23 days
Felix-FileInstall | 1 mo 21 days
Felix-Gogo | 2 mo 11 days
Felix-WebConsole | 1 mo 23 days
Hadoop-20-Build | 2 yr 3 mo
Hadoop-Hdfs-21-Build | 5 mo 5 days
Hadoop-Hdfs-trunk | 5 mo 20 days
Hadoop-Mapreduce-21-Build | 3 mo 25 days
Hadoop-Mapreduce-trunk | 3 mo 25 days
Hadoop-Mapreduce-trunk-Commit | 5 mo 20 days
Hadoop-Patch-h1.grid.sp2.yahoo.net | 3 mo 21 days
Hadoop-Patch-h4.grid.sp2.yahoo.net | 3 mo 16 days
Hadoop-Patch-h9.grid.sp2.yahoo.net | 8 mo 12 days
Hama-Patch | 2 mo 28 days
Hama-Patch-Admin | 1 mo 10 days
Hdfs-Patch-h2.grid.sp2.yahoo.net | 5 mo 21 days
Hdfs-Patch-h5.grid.sp2.yahoo.net | 5 mo 21 days
Hive-trunk-h0.18 | 1 mo 10 days
Hive-trunk-h0.19 | 4 mo 20 days
Jackrabbit-1.6 | 1 mo 12 days
Jackrabbit-classloader | 3 mo 10 days
Jackrabbit-ocm | 3 mo 10 days
jspf-trunk | 1 mo 10 days
Mahout-Patch-Admin | 1 yr 11 mo
mailet-standard-trunk | 2 mo 4 days
Mapreduce-Patch-h3.grid.sp2.yahoo.net | 4 mo 27 days
Mapreduce-Patch-h4.grid.sp2.yahoo.net | 3 mo 29 days
Mapreduce-Patch-h6.grid.sp2.yahoo.net | 4 mo 16 days
Mapreduce-Patch-h9.grid.sp2.yahoo.net | 6 mo 23 days
Nutch-trunk | 2 mo 19 days
org.apache.kato.eclipse | 1 yr 2 mo
Pig-Patch-h7.grid.sp2.yahoo.net | 3 mo 17 days
Pig-Patch-h8.grid.sp2.yahoo.net | 5 mo 0 days
ServiceMix-Plugins | 2 mo 12 days
ServiceMix-Utils | 2 mo 10 days
ServiceMix3 | 1 mo 3 days
Shiro | 1 mo 13 days
struts-annotations | 1 yr 1 mo
tapestry-5.0-freestyle | 6 mo 11 days
TestBuilds | 1 yr 0 mo
Turbine Fulcrum | 3 mo 21 days
Tuscany-1x | 9 mo 6 days
Tuscany-run-plugin | 3 mo 28 days
Zookeeper-Patch-h1.grid.sp2.yahoo.net | 2 mo 6 days
Zookeeper-Patch-h7.grid.sp2.yahoo.net | 1 mo 20 days

/niklas
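For reference, a pass like this can be scripted against Hudson's remote JSON API. The sketch below is only an illustration of the approach, not the actual script referred to above; the base URL, the anonymous read access, and the POST-able /job/<name>/disable endpoint are assumptions about this particular Hudson instance.

    import json
    import urllib.parse
    import urllib.request
    from datetime import datetime, timedelta

    # Assumed instance URL and a disable endpoint that accepts an empty POST.
    BASE_URL = "https://hudson.zones.apache.org/hudson"
    MAX_AGE = timedelta(days=31)

    # Fetch each job's status colour and the timestamp of its last successful build.
    api = BASE_URL + "/api/json?tree=jobs[name,color,lastSuccessfulBuild[timestamp]]"
    with urllib.request.urlopen(api) as resp:
        jobs = json.load(resp)["jobs"]

    now = datetime.now()
    for job in jobs:
        if not job["color"].startswith("red"):
            continue  # only currently failing jobs are candidates
        last_ok = job.get("lastSuccessfulBuild")
        # Build timestamps are milliseconds since the epoch; a job with no
        # successful build at all is treated as infinitely old.
        last_ok_time = (datetime.fromtimestamp(last_ok["timestamp"] / 1000)
                        if last_ok else datetime.min)
        if now - last_ok_time > MAX_AGE:
            print("disabling", job["name"])
            disable_url = BASE_URL + "/job/" + urllib.parse.quote(job["name"]) + "/disable"
            urllib.request.urlopen(disable_url, data=b"")  # empty body forces a POST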
Re: Builds that have been failing for a while
Hi,

On Fri, Sep 24, 2010 at 10:37 AM, Niklas Gustavsson wrote:
> Be warned, I'll run the script to disable builds which have been failing
> for more than 31 days on Sunday. This is the current list of such jobs:
> [...]
> Jackrabbit-1.6 | 1 mo 12 days
> Jackrabbit-classloader | 3 mo 10 days
> Jackrabbit-ocm | 3 mo 10 days

These are builds that are configured to run only when there's a change in the related codebase, so even if they've been red for a long time, they don't really consume build resources. As soon as someone gets around to fixing the pending errors, I expect the CI build to start up again automatically to verify the fix.

I suggest that we only disable *periodic* building of codebases that have been failing for a long time.

BR,

Jukka Zitting
Re: Builds that have been failing for a while
On Fri, Sep 24, 2010 at 10:48 AM, Jukka Zitting wrote:
> On Fri, Sep 24, 2010 at 10:37 AM, Niklas Gustavsson wrote:
>> Jackrabbit-1.6 | 1 mo 12 days
>> Jackrabbit-classloader | 3 mo 10 days
>> Jackrabbit-ocm | 3 mo 10 days
>
> These are builds that are configured to run only when there's a change
> in the related codebase, so even if they've been red for a long time,
> they don't really consume build resources. As soon as someone gets
> around to fixing the pending errors, I expect the CI build to start up
> again automatically to verify the fix.
>
> I suggest that we only disable *periodic* building of codebases that
> have been failing for a long time.

These three builds are set to check for updates on a periodic basis (polling the SCM every hour) and when upstream dependencies are built.

/niklas
Re: Builds that have been failing for a while
Hi,

On Fri, Sep 24, 2010 at 11:05 AM, Niklas Gustavsson wrote:
> These three builds are set to check for updates on a periodic basis
> (polling the SCM every hour) and when upstream dependencies are built.

That shouldn't be too much of a burden, or is it? It doesn't tie up executors like some of the other failing builds.

I'm all for disabling builds that continuously keep failing, but in these cases only the last build has failed, and I totally expect the builds to go blue again as soon as someone gets around to touching the codebases.

Instead of the time limit, would it make more sense to only disable those jobs where >n of the last builds have failed?

BR,

Jukka Zitting
Re: Builds that have been failing for a while
On Fri, Sep 24, 2010 at 11:24 AM, Jukka Zitting wrote:
> That shouldn't be too much of a burden, or is it? It doesn't tie up
> executors like some of the other failing builds.

It does tie up an SCMTrigger, which is a resource that requires administration when builds keep failing (the triggers get stuck when slaves fail and need to be killed, or they will keep a thread stuck forever). That said, it is certainly not as resource intensive as running the full build.

> Instead of the time limit, would it make more sense to only disable
> those jobs where >n of the last builds have failed?

Reasonable idea, let me play around with a script for that purpose and get back with a new list to compare.

/niklas
RE: Builds that have been failing for a while
> -----Original Message-----
> From: Jukka Zitting [mailto:jukka.zitt...@gmail.com]
> Sent: Friday, 24 September 2010 7:25 PM
> To: builds@apache.org
> Subject: Re: Builds that have been failing for a while
>
> Hi,
>
> On Fri, Sep 24, 2010 at 11:05 AM, Niklas Gustavsson wrote:
> > These three builds are set to check for updates on a periodic basis
> > (polling the SCM every hour) and when upstream dependencies are built.
>
> That shouldn't be too much of a burden, or is it? It doesn't tie up
> executors like some of the other failing builds.
>
> I'm all for disabling builds that continuously keep failing, but in
> these cases only the last build has failed, and I totally expect the
> builds to go blue again as soon as someone gets around to touching the
> codebases.
>
> Instead of the time limit, would it make more sense to only disable
> those jobs where >n of the last builds have failed?

Depends on the trigger frequency; the last n builds could be used up in one day by some projects and take months to reach for others.

I would suggest a combination of both methods - perhaps a time of 30 days .and. the last 5 builds failed, or something like that?

This is a new thing that needs doing; we can't have everyone replying saying 'oh yeah please don't disable my build due to blah ...'. Let's find a sensible setting and stick to it. The aim is to get people to fix their builds or they will be disabled until they are fixed, simple.

Gav...

> BR,
>
> Jukka Zitting
Re: Builds that have been failing for a while
On Fri, Sep 24, 2010 at 11:44 AM, Gav... wrote:
> I would suggest a combination of both methods - perhaps a time of 30
> days .and. the last 5 builds failed, or something like that?

That was my plan as well. Here's the list of jobs that have failed for more than one month and with more than 3 unsuccessful builds in a row:

Cayenne-doc | 1 mo 21 days | 13
clerezza-site | 3 mo 21 days | 7
Felix-WebConsole | 1 mo 23 days | 7
Hadoop-20-Build | 2 yr 3 mo | 15
Hadoop-Hdfs-21-Build | 5 mo 6 days | 15
Hadoop-Hdfs-trunk | 5 mo 20 days | 40
Hadoop-Mapreduce-21-Build | 3 mo 25 days | 15
Hadoop-Mapreduce-trunk | 3 mo 25 days | 38
Hadoop-Mapreduce-trunk-Commit | 5 mo 20 days | 30
Hadoop-Patch-h4.grid.sp2.yahoo.net | 3 mo 16 days | 18
Hadoop-Patch-h9.grid.sp2.yahoo.net | 8 mo 12 days | 8
Hama-Patch-Admin | 1 mo 10 days | 5
Hdfs-Patch-h2.grid.sp2.yahoo.net | 5 mo 21 days | 11
Hdfs-Patch-h5.grid.sp2.yahoo.net | 5 mo 21 days | 11
Hive-trunk-h0.18 | 1 mo 10 days | 17
Hive-trunk-h0.19 | 4 mo 20 days | 17
jspf-trunk | 1 mo 10 days | 5
Mahout-Patch-Admin | 1 yr 11 mo | 5
mailet-standard-trunk | 2 mo 4 days | 4
Mapreduce-Patch-h4.grid.sp2.yahoo.net | 3 mo 29 days | 5
Mapreduce-Patch-h6.grid.sp2.yahoo.net | 4 mo 16 days | 4
Nutch-trunk | 2 mo 19 days | 40
Pig-Patch-h7.grid.sp2.yahoo.net | 3 mo 17 days | 23
Pig-Patch-h8.grid.sp2.yahoo.net | 5 mo 0 days | 33
Shiro | 1 mo 13 days | 4
tapestry-5.0-freestyle | 6 mo 11 days | 5
Zookeeper-Patch-h7.grid.sp2.yahoo.net | 1 mo 20 days | 11

/niklas
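The "more than 3 unsuccessful builds in a row" part of that filter can be computed from the per-job build history exposed by the same remote API. A minimal sketch, again assuming the same instance URL, anonymous read access, and that the API returns builds newest first:

    import json
    import urllib.parse
    import urllib.request

    BASE_URL = "https://hudson.zones.apache.org/hudson"  # assumed instance URL

    def consecutive_failures(job_name):
        """Count unsuccessful builds since the most recent successful one."""
        api = (BASE_URL + "/job/" + urllib.parse.quote(job_name)
               + "/api/json?tree=builds[result]")
        with urllib.request.urlopen(api) as resp:
            builds = json.load(resp)["builds"]  # newest build first
        streak = 0
        for build in builds:
            if build.get("result") == "SUCCESS":
                break  # the failing streak ends at the last successful build
            streak += 1
        return streak

    # A job is then only a candidate for disabling if both conditions hold,
    # e.g. age_of_last_success > timedelta(days=30) and consecutive_failures(name) > 3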
Re: Builds that have been failing for a while
Hi,

On 24.09.2010 14:20, Niklas Gustavsson wrote:
> On Fri, Sep 24, 2010 at 11:44 AM, Gav... wrote:
>> I would suggest a combination of both methods - perhaps a time of 30
>> days .and. the last 5 builds failed, or something like that?
>
> That was my plan as well. Here's the list of jobs that have failed for
> more than one month and with more than 3 unsuccessful builds in a row:
> [...]

Perhaps it might make sense to also notify the respective PMCs and advertise this mailing list, as I could imagine that some PMCs are not even aware of it.

Tammo

--
Tammo van Lessen - http://www.taval.de
Re: Builds that have been failing for a while
2010/9/24 Tammo van Lessen:
> Perhaps it might make sense to also notify the respective PMCs and
> advertise this mailing list, as I could imagine that some PMCs are not
> even aware of it.
>
> Tammo
>
> --
> Tammo van Lessen - http://www.taval.de

+1

Regarding the clerezza-site job, I am taking a look at possible failing causes.

Regards,
Tommaso
Re: Builds that have been failing for a while
On Fri, Sep 24, 2010 at 2:36 PM, Tammo van Lessen wrote:
> Perhaps it might make sense to also notify the respective PMCs and
> advertise this mailing list, as I could imagine that some PMCs are not
> even aware of it.

We do ask those who get access to Hudson to follow this list for this exact purpose.

/niklas
SSH problem again
Hi,

The SSH problem is there again.

Building remotely on ubuntu1
hudson.util.IOException2: remote file operation failed: /home/hudson/hudson-slave/workspace/dir-skins-jdk15-ubuntu-deploy-site at hudson.remoting.chan...@75167bb3:ubuntu1
    at hudson.FilePath.act(FilePath.java:749)
    at hudson.FilePath.act(FilePath.java:735)
    at hudson.FilePath.mkdirs(FilePath.java:801)
    at hudson.model.AbstractProject.checkout(AbstractProject.java:1059)
    at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:479)
    at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:411)
    at hudson.model.Run.run(Run.java:1273)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:291)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:129)
Caused by: java.io.IOException: SSH channel is closed. (Close requested by remote)
    at com.trilead.ssh2.channel.ChannelManager.sendData(ChannelManager.java:383)
    at com.trilead.ssh2.channel.ChannelOutputStream.write(ChannelOutputStream.java:63)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1838)
    at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1876)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1537)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:329)
    at hudson.remoting.Channel.send(Channel.java:419)
    at hudson.remoting.Request.call(Request.java:105)
    at hudson.remoting.Channel.call(Channel.java:557)
    at hudson.FilePath.act(FilePath.java:742)
    ... 9 more

Kind Regards,
Stefan