Fwd: buildbot success in ASF Buildbot on hadoop-trunk
Why are we building Hadoop on Buildbot? Nige Begin forwarded message: From: Date: April 8, 2009 9:02:16 PM PDT To: Subject: buildbot success in ASF Buildbot on hadoop-trunk Reply-To: core-...@hadoop.apache.org The Buildbot has finished a build of hadoop-trunk on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/hadoop-trunk/builds/18 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: Build Source Stamp: unavailable Blamelist: nigel Build succeeded! sincerely, -The Buildbot
Re: buildbot success in ASF Buildbot on hadoop-trunk
Testing buildbot w/ hadoop is fine, but can you turn off emails to core-...@hadoop? All community builds for Hadoop are currently done on Hudson: http://hudson.zones.apache.org/hudson/view/Hadoop/ Cheers, Nige On Apr 8, 2009, at 9:24 PM, Gavin wrote: -Original Message- From: Nigel Daley [mailto:nda...@yahoo-inc.com] Sent: Thursday, 9 April 2009 2:13 PM To: builds@apache.org Subject: Fwd: buildbot success in ASF Buildbot on hadoop-trunk Why are we building Hadoop on Buildbot? I'm testing Buildbot, to do that I need to build something. I picked Hadoop as one of them for its lengthy time of building. It's still not right as it passes in about 4 minutes as opposed to a few hours. I was going to get around to asking for some config information so I can build it correctly. It does sound however from your tone above that you don't want Hadoop to be built on Buildbot, if that's true let me know so I don't waste any more of my time on it. Gav... Nige Begin forwarded message: From: Date: April 8, 2009 9:02:16 PM PDT To: Subject: buildbot success in ASF Buildbot on hadoop-trunk Reply-To: core-...@hadoop.apache.org The Buildbot has finished a build of hadoop-trunk on ASF Buildbot. Full details are available at: http://ci.apache.org/builders/hadoop-trunk/builds/18 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: Build Source Stamp: unavailable Blamelist: nigel Build succeeded! sincerely, -The Buildbot
Re: Welcome to the builds list!
Justin and I (Hudson admins) have been asking build owners to subscribe to infrastructure@ for this kind of info. I guess that's no longer the right place? Justin, should we change that and ask folks to sign up here? Also, Gavin can we have ci.apache.org go to a page that points to the 3 CI choices that Apache projects have: Buildbot, Continuum, and Hudson? Thanks, Nige On Apr 9, 2009, at 7:20 PM, Gavin wrote: Hi All, I see some are here already, good! The main aim of this list is to be a gathering place where all Apache projects can get together with the various admins of the various build services that Apache runs. If any project wants a new service running on one of the build servers, or wants configuration changes or additions, this is the place to ask. If anyone spots any problems with any of the build servers or the projects running on them, the Jira Issue Tracker for Infra [1] is the main place to report such problems in the first instance, but can also be reported and/or discussed here. On the flip-side - this is where us admins can also ask questions of the projects themselves. One of the reasons for creating this list was so that I did not have to subscribe to 50+ mailing lists in order to ask for the occasional bit of build requirements/info when setting up new builds etc. For my part, I'm here to look after the Buildbot instance - ci.apache.org. I will no doubt be asking as well as hopefully answering and assisting. For the other CIs such as Hudson, Continuum etc there are other folks here to help and assist with those. Feel free to ask any questions or make comments, enjoy. Gav...
Re: [hudson] Simplified user accounts
On Apr 14, 2009, at 4:06 AM, Jukka Zitting wrote: Hi, Currently the user accounts on our Hudson instances are fairly heavyweight, with separate Unix accounts associated with each Hudson account. Do we need this? Probably not. It would be easier if most users had just accounts for the Hudson web interface, and you'd only get a corresponding Unix account if you really needed to install extra software on the build servers. For example, in the projects I'm involved with it would be useful to give Hudson accounts to all or most of the committers so that they can better inspect build settings and results, trigger new builds, and set up new build jobs. None of these operations require command line access. Justin, what do you think about offering Hudson webapp logins to any committer, but only shell logins to PMC members? Nige
Re: Welcome to the builds list!
On Apr 16, 2009, at 2:25 AM, Gavin wrote: -Original Message- From: bdelacre...@gmail.com [mailto:bdelacre...@gmail.com] On Behalf Of Bertrand Delacretaz Sent: Wednesday, 15 April 2009 6:28 PM To: builds@apache.org Subject: Re: Welcome to the builds list! On Fri, Apr 10, 2009 at 1:13 PM, Gavin wrote: -Original Message- From: Nigel Daley [mailto:nda...@yahoo-inc.com] Also, Gavin can we have ci.apache.org go to a page that points to the 3 CI choices that Apache projects have: Buildbot, Continuum, and Hudson? Absolutely, ci.apache.org was created a couple of months back whilst testing buildbot and before this amalgamation of CIs list was thought about. It makes sense so I will do that soon, thanks for the suggestion It might be good to mention this list there as well. -Bertrand Yep good plan. A slight change to name to use as the main builds area to go to, ci.apache.org has been advertised/blogged/tweeted/whatever as belonging to the buildbot setup. I think it is too late now to move that to another domain. Either way, a new name was needed for either buildbot usage or the main builds area (landing page?). So, to keep inline with this list name etc, and Wendy proposed on IRC that builds.apache.org would be a better fit for the new area. If there are no objections I'd like to ask infra if they could set that up for us? (cc:d) Gav... Hmm, not crazy about this, but ok. Not sure why you didn't go with buildbot.apache.org. Taking ci.apache.org seems a little over reaching. Nige
Re: Welcome to the builds list!
Gavin, I really appreciate what you've done to setup builds@ and build.html. I think these are 2 steps that are very helpful to the Apache community. It has been very confusing for projects on where/ how to get automated builds up and running and I think these steps greatly improve that situation. Your activity here has propelled us to file INFRA-2015 to simplify Hudson's URL. Over the past 2.5 years of running Hudson (first on lucene.zones and then centralizing on hudson.zones), there have been a number of frustrating issues I've had to work through to get the resources we needed for Hudson. Now, I simply want projects to understand the options they have available. I apologize that I haven't been reading infra@ very closely over the past couple months and could have weighed in sooner with the suggestion that buildbot follow the hudson and continuum naming convention. Given you've now got ci.apache.org, I thought we agreed a couple weeks ago that the top level index.html would simply provide a pointer to the 3 build solutions. It wasn't clear why there was a change in that plan. Perhaps that's less important now that you've created build.html. Thanks, Nige On Apr 23, 2009, at 12:28 AM, Gavin wrote: -Original Message- From: Nigel Daley [mailto:nda...@yahoo-inc.com] Sent: Thursday, 23 April 2009 3:23 PM To: Gavin Cc: builds@apache.org; infrastruct...@apache.org Subject: Re: Welcome to the builds list! On Apr 16, 2009, at 2:25 AM, Gavin wrote: -Original Message- From: bdelacre...@gmail.com [mailto:bdelacre...@gmail.com] On Behalf Of Bertrand Delacretaz Sent: Wednesday, 15 April 2009 6:28 PM To: builds@apache.org Subject: Re: Welcome to the builds list! On Fri, Apr 10, 2009 at 1:13 PM, Gavin wrote: -Original Message- From: Nigel Daley [mailto:nda...@yahoo-inc.com] Also, Gavin can we have ci.apache.org go to a page that points to the 3 CI choices that Apache projects have: Buildbot, Continuum, and Hudson? 
Absolutely, ci.apache.org was created a couple of months back whilst testing buildbot and before this amalgamation of CIs list was thought about. It makes sense so I will do that soon, thanks for the suggestion It might be good to mention this list there as well. -Bertrand Yep good plan. A slight change to name to use as the main builds area to go to, ci.apache.org has been advertised/blogged/tweeted/whatever as belonging to the buildbot setup. I think it is too late now to move that to another domain. Either way, a new name was needed for either buildbot usage or the main builds area (landing page?). So, to keep inline with this list name etc, and Wendy proposed on IRC that builds.apache.org would be a better fit for the new area. If there are no objections I'd like to ask infra if they could set that up for us? (cc:d) Gav... Hmm, not crazy about this, but ok. You're a little behind, please see the mail entitled 'New landing page' - we didn't go with builds.apache.org either. Not sure why you didn't go with buildbot.apache.org. Taking ci.apache.org seems a little over reaching. What planet are you on? This isn't a competition. Everything I do around here you question with disdain, you continually angle at getting me to advertise Hudson wherever possible. ci.apache.org was chosen as buildbot.apache.org was rejected by other infra members as being too specific. ci.apache.org was someone else's idea. So it was enabled and I happily started setting up buildbot to use it. Then I thought it would be a good idea to create a mailing list where folks could talk about buildbot and its services etc basically so I didn't have to join 200+ dev lists. Someone then said why not make it a list for all build services, so we did that.
You then wanted a landing page area that people could go to to 'choose' between which build service they want, and you wanted to use ci.apache.org for it that buildbot was already using and was advertised as such, I offered builds.apache.org as an alternative, others in infra thought that creating new subdomain for one page was too much (you could call it 'a little over reaching') so I created a page called http://apache.org/dev/builds.html (after ci.html was also rejected) - and I asked you to edit your section to suit which you have yet to do. Note that I am here for Buildbot, but I have bent over backwards to accommodate your moaning about Hudson. You haven't asked Continuum or Gump to advertise 'your' Hudson on their pages so why are you picking on me. I've helped everyone by creating a landing page, by getting this mail list set up, with the aim that every project can make use of for whichever build service they want to use, and you're still not happy. Where are your commits, what have you done?
Re: [hudson] Dead executor
Thanks Jukka! n. On May 15, 2009, at 12:23 AM, Jukka Zitting wrote: Hi, One of the two executor threads on hudson.zones.apache.org died because of: java.lang.OutOfMemoryError: GC overhead limit exceeded See [1] and [2] for more details. I'm scheduling Hudson for shutdown and will restart the master once there are no more active builds. [1] http://hudson.zones.apache.org/hudson/computers/0/executors/0/causeOfDeath [2] http://wiki.hudson-ci.org/display/HUDSON/Dead+Executor BR, Jukka Zitting
{minerva,ceres,vesta,isis,ci}.apache.org downtime
Folks, I'm informed that {minerva,ceres,vesta,isis}.apache.org (which includes ci.apache.org) will be taken down at 10am PDT on Wednesday, May 20, for 1 hour to move them to a new cabinet. This will affect Buildbot and Hudson services for that 1 hour. Speak up asap if there is a pressing issue with this time slot. FWIW, I'll put up a notice on hudson.zones that 2 of its build slaves will be offline for the hour. Thanks, Nige
Re: {minerva,ceres,vesta,isis,ci}.apache.org downtime
This move is now complete. Looks like machines are back up. Cheers, Nige On May 18, 2009, at 11:02 PM, Nigel Daley wrote: Folks, I'm informed that {minerva,ceres,vesta,isis}.apache.org (which includes ci.apache.org) will be taken down at 10am PDT on Wednesday, May 20, for 1 hour to move them to a new cabinet. This will affect Buildbot and Hudson services for that 1 hour. Speak up asap if there is a pressing issue with this time slot. FWIW, I'll put up a notice on hudson.zones that 2 of its build slaves will be offline for the hour. Thanks, Nige
Move builds off of Hudson master
Folks,

I'd really like to move builds off the Hudson master. Here's a proposal:

1) We move the Hadoop related builds (Common, HDFS, Mapreduce, Pig, ZooKeeper, Hive, HBase, Chukwa, Avro) off to some other machines (see 4 below)
2) That would free up minerva and vesta as Ubuntu build slaves for all the other projects (which should be more than enough capacity).
3) We get permission to use the current lucene.zones slave as a Solaris build slave for those projects that really want a Solaris build (how many is that I wonder?)
4) We add a bunch more Ubuntu slaves to hudson.zones out of a pool of publicly IP'd yahoo.net machines my employer has for Hadoop related builds.

Thoughts?

Cheers, Nige

On Jun 30, 2009, at 6:17 AM, Justin Mason wrote:

On Tue, Jun 30, 2009 at 13:46, sebb wrote:

On 30/06/2009, Jukka Zitting wrote:

Hi, Another Tuscany-2x build [1] was stuck with lots of OOM errors and other failures in the console log. I killed the build as it was taking already almost 7 hours, which is much more than the 40 minutes used by the last successful build.

[1] http://hudson.zones.apache.org/hudson/job/Tuscany-2x/116/

It looked to me as though the build was stalled, i.e. Hudson was not able to detect/recover from the situation. Is this a known problem? Is there any way to give the builds a bit more memory? It looks like Tuscany has not built successfully for a long while, so this is likely to keep happening. It's a pity that the console output does not have time-stamps, or it would be a lot easier to tell that nothing was happening.

It could be the entire machine was under memory pressure, given those OOM errors. I wonder if that caused the Hudson master to get confused.

--j.
Re: Hudson administrivia, build timeout
Many thanks Justin!

I think there's something funky with the Bugzilla Plugin config. Our Jira issues are being broken into 2 links now. Example, MAPREDUCE-693 is broken into

1) MAPREDUCE- links to Jira issue
2) 693 links to bugzilla

Nige

On Jul 7, 2009, at 3:43 AM, Justin Mason wrote:

I'm installing the following plugins:

- Audit Trail plugin ('Keep a log of who performed particular Hudson operations, such as configuring jobs', handy in our configuration with so many users)
- Bugzilla Plugin ('This plugin integrates Bugzilla into Hudson', we use bugzilla in SpamAssassin and I'm sure there are others)
- Warnings Plugin ('This plugin generates the trend report for compiler warnings in the build log', looks pretty nifty!)
- and I'm going to re-try the Build Timeout plugin ('This plugin allows you to automatically abort a build if it's taking too long').

We tried the build timeout before, I think, and it didn't help. But I think some of the timeouts we're seeing now are due to broken tests on some projects, and we've upgraded Hudson itself since the last try, so it's worth a retry in my opinion. I also checked the recent Hudson changelog, but nothing relevant has been implemented that would fix build hangs.

Anyway, if your project has had problems with build hangs, please enable a timeout on the "Configure" page. It's about halfway down, in the "Build Environment" section -- tick the '[x] Abort the build if it's stuck' tickbox and set 'Timeout minutes' to a sane upper limit. If we run into builds of your projects timing out, we'll set this for you. ;)

--j.
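
Note on the split links: this is the behaviour one would expect if a bare-number issue pattern is applied to text that already contains Jira keys. The thread does not show the plugin's actual configuration, so the two patterns below are hypothetical and only illustrate the effect: a plain digit pattern picks up the 693 inside MAPREDUCE-693, while a pattern that refuses digits preceded by a word character or a hyphen does not.

```python
import re

# Hypothetical patterns for illustration only; the real plugin settings
# are not shown anywhere in this thread.
BARE_NUMBER = re.compile(r"\d+")                     # any run of digits
STANDALONE_NUMBER = re.compile(r"(?<![\w-])\d+\b")   # skip digits that follow "KEY-"


def bug_ids(pattern, text):
    """Return every substring of text that the pattern would turn into a bug link."""
    return pattern.findall(text)


message = "MAPREDUCE-693 fixes the regression from bug 4211"
print(bug_ids(BARE_NUMBER, message))        # digits inside the Jira key are matched too
print(bug_ids(STANDALONE_NUMBER, message))  # only the free-standing number is matched
```

Whether the installed plugin allows its pattern to be changed is not something this thread confirms.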
Re: Hudson administrivia, build timeout
On Jul 8, 2009, at 4:24 AM, Jukka Zitting wrote: Hi, On Wed, Jul 8, 2009 at 12:50 PM, Justin Mason wrote: Should we set policy for the ASF hudson instance regarding the max runtime of builds, seeing as we only have 4 build executor slots? That would be good, though we may want to allow longer builds that only run relatively seldom (e.g. weekly). +1. There may need to be exceptions to this. If I recall, the Harmony builds take a long time. Nige
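
Note on a max-runtime policy with exceptions: the idea can be sketched as a default limit plus a small table of per-job overrides, applied by a wrapper that aborts a command once the limit is exceeded. This is a minimal sketch using only the Python standard library. It is not how the Hudson Build Timeout plugin works, and the limits and the job name are hypothetical, since the thread does not settle on any numbers.

```python
import subprocess
import sys

# Hypothetical limits in minutes; not ASF policy.
DEFAULT_LIMIT_MINUTES = 60
EXCEPTIONS = {"weekly-long-build": 240}  # e.g. long builds that only run seldom


def limit_for(job_name):
    """Look up the limit for a job, falling back to the default."""
    return EXCEPTIONS.get(job_name, DEFAULT_LIMIT_MINUTES)


def run_with_timeout(cmd, timeout_minutes):
    """Run cmd and report 'ok', 'failed', or 'aborted' if the limit is exceeded."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        result = subprocess.run(cmd, timeout=timeout_minutes * 60)
    except subprocess.TimeoutExpired:
        return "aborted"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    build_cmd = [sys.executable, "-c", "print('build ok')"]  # stand-in for a real build
    print(run_with_timeout(build_cmd, limit_for("any-job")))
```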
Re: Hudson administrivia, build timeout
On Jul 10, 2009, at 4:37 AM, Justin Mason wrote: On Thu, Jul 9, 2009 at 14:12, Justin Mason wrote: On Thu, Jul 9, 2009 at 13:29, Jukka Zitting wrote: Hi, On Thu, Jul 9, 2009 at 2:19 PM, Justin Mason wrote: FWIW, my experience over the last few days of monitoring has been that our build backlogs on the Hudson machine are due to contention for the limited number of executors; particularly the 2 on the main instance. There are a few projects that perform 1.5-hour deployments from this. IMO we need to come up with a way to accommodate this. Should we add a "long build" lock that all builds that normally take more than say 60 minutes should synchronize on? That way we'd never have situations where two long builds block both executors at the same time. hmm, that's a good idea. WDYT, Nigel? Nigel's on holidays. I've gone ahead and done this anyway ;) There are now two Locks: "Long-running jobs on hudson.zones.apache.org" and "Long-running jobs on minerva.apache.org". (I haven't created one for the Lucene/Hadoop hosts.) Any builds that seem to be taking a very long time (fsvo "very long") will be changed to synch on those locks, in order to leave one of the executors free on those hosts for shorter builds. I've changed all the builds that seem to be consistently running for longer than 1 hour to sync on those locks. Yes, great idea. Thanks Justin! Nige
Re: Move builds off of Hudson master
FWIW, I'm still working on getting the yahoo.net machines properly imaged. Hoping to have them when I get back from vacation week of July 27. Nige On Jul 17, 2009, at 9:15 AM, Justin Mason wrote: On Thu, Jul 2, 2009 at 18:36, Nigel Daley wrote: Folks, I'd really like to move builds off the Hudson master. Here's a proposal: 1) We move the Hadoop related builds (Common, HDFS, Mapreduce, Pig, ZooKeeper, Hive, HBase, Chukwa, Avro) off to some other machines (see 4 below) 2) That would free up minerva and vesta as Ubuntu build slaves for all the other projects (which should be more than enough capacity). 3) We get permission to use the current lucene.zones slave as a Solaris build slave for those projects that really want a Solaris build (how many is that I wonder?) 4) We add a bunch more Ubuntu slaves to hudson.zones out of a pool of publicly IP'd yahoo.net machines my employer has for Hadoop related builds. So -- what's the situation with this proposal? I'm all in favour. I've been monitoring Hudson closely for the past 2 weeks, and it's clear that it's over-capacity. Even with the limiting band-aids I've been putting in place to control overlong builds, right now, the build queue has 8 pending builds waiting for a free executor, and that's been pretty much the normal situation. It needs more machines. Paul, are you still -1? --j. Cheers, Nige On Jun 30, 2009, at 6:17 AM, Justin Mason wrote: On Tue, Jun 30, 2009 at 13:46, sebb wrote: On 30/06/2009, Jukka Zitting wrote: Hi, Another Tuscany-2x build [1] was stuck with lots of OOM errors and other failures in the console log. I killed the build as it was taking already almost 7 hours, which is much more than the 40 minutes used by the last successful build. [1] http://hudson.zones.apache.org/hudson/job/Tuscany-2x/116/ It looked to me as though the build was stalled, i.e. Hudson was not able to detect/recover from the situation. Is this a known problem? Is there any way to give the builds a bit more memory? 
It looks like Tuscany has not built successfully for a long while, so this is likely to keep happening. It's a pity that the console output does not have time-stamps, or it would be a lot easier to tell that nothing was happening. It could be the entire machine was under memory pressure, given those OOM errors. I wonder if that caused the Hudson master to get confused. --j. -- --j.
Re: home dirs on hudson.zones.apache.org
On Jul 24, 2009, at 4:44 AM, Joe Schaefer wrote: Frankly the only thing we should be backing up on the hudson zone is the hudson-specific stuff, so just correct the backup script please. +1. - Original Message From: Tony Stevenson To: Justin Mason Cc: builds@apache.org Sent: Friday, July 24, 2009 7:25:55 AM Subject: Re: home dirs on hudson.zones.apache.org Not from the current folder no, if you give me the path of the hudson specific stuff I'll do that manually. Tony On 24 Jul 2009, at 12:24, Justin Mason wrote: On Fri, Jul 24, 2009 at 12:22, Tony Stevenson wrote: Until this is done backups will not be taken of the hudson zone. They won't? none of them? --j. Cheers, Tony Tony Stevenson t...@pc-tony.com - pct...@apache.org pct...@freenode.net - t...@caret.cam.ac.uk http://blog.pc-tony.com 1024D/51047D66 ECAF DC55 C608 5E82 0B5E 3359 C9C7 924E 5104 7D66
Re: Update hudson or what?
The slave on minerva was out of sync with the master. It has now been updated. Nige On Aug 31, 2009, at 12:45 AM, Jukka Zitting wrote: Hi, On Mon, Aug 31, 2009 at 3:00 AM, Benson Margulies wrote: CXF hudson builds chronically report the following. Could this be just a result of an old version of hudson? I see the same exception in my build logs, but it doesn't seem to cause any harm. The d...@hudson mailing list suggests [1] that this issue is caused by the slave server running a different Hudson version than the master. [1] http://www.nabble.com/Failed-to-execute-command-Pipe.EOF(0)-td24611558.html BR, Jukka Zitting
Re: Move builds off of Hudson master
New yahoo.net Hudson slaves are now hooked up and related Hadoop builds have been moved to these slaves. This frees up vesta for any other builds. I'll follow up with another email on moving builds to vesta and off the master. Cheers, Nige On Jul 17, 2009, at 12:17 PM, Nigel Daley wrote: FWIW, I'm still working on getting the yahoo.net machines properly imaged. Hoping to have them when I get back from vacation week of July 27. Nige On Jul 17, 2009, at 9:15 AM, Justin Mason wrote: On Thu, Jul 2, 2009 at 18:36, Nigel Daley wrote: Folks, I'd really like to move builds off the Hudson master. Here's a proposal: 1) We move the Hadoop related builds (Common, HDFS, Mapreduce, Pig, ZooKeeper, Hive, HBase, Chukwa, Avro) off to some other machines (see 4 below) 2) That would free up minerva and vesta as Ubuntu build slaves for all the other projects (which should be more than enough capacity). 3) We get permission to use the current lucene.zones slave as a Solaris build slave for those projects that really want a Solaris build (how many is that I wonder?) 4) We add a bunch more Ubuntu slaves to hudson.zones out of a pool of publicly IP'd yahoo.net machines my employer has for Hadoop related builds. So -- what's the situation with this proposal? I'm all in favour. I've been monitoring Hudson closely for the past 2 weeks, and it's clear that it's over-capacity. Even with the limiting band-aids I've been putting in place to control overlong builds, right now, the build queue has 8 pending builds waiting for a free executor, and that's been pretty much the normal situation. It needs more machines. Paul, are you still -1? --j. Cheers, Nige On Jun 30, 2009, at 6:17 AM, Justin Mason wrote: On Tue, Jun 30, 2009 at 13:46, sebb wrote: On 30/06/2009, Jukka Zitting wrote: Hi, Another Tuscany-2x build [1] was stuck with lots of OOM errors and other failures in the console log. 
I killed the build as it was taking already almost 7 hours, which is much more than the 40 minutes used by the last successful build. [1] http://hudson.zones.apache.org/hudson/job/Tuscany-2x/116/ It looked to me as though the build was stalled, i.e. Hudson was not able to detect/recover from the situation. Is this a known problem? Is there any way to give the builds a bit more memory? It looks like Tuscany has not built successfully for a long while, so this is likely to keep happening. It's a pity that the console output does not have time-stamps, or it would be a lot easier to tell that nothing was happening. It could be the entire machine was under memory pressure, given those OOM errors. I wonder if that caused the Hudson master to get confused. --j. -- --j.
Hudson upgraded
Hudson (http://hudson.zones.apache.org/hudson) has been upgraded from 1.290 to 1.323. Change log is here: https://hudson.dev.java.net/changelog.html I have also installed the Warnings plugin and the Cobertura plugin and updated a number of installed plugins to their latest versions. Let us know if you see any problems. Cheers, Nigel
Re: [hudson] Frequent down/up notifications
I believe it gets so loaded by builds running ON the machine that the monitoring daemon can't respond. This is why we need to get builds off of hudson.zones and onto its slave machines. Nige On Sep 23, 2009, at 7:43 AM, Jukka Zitting wrote: Hi, As a recent new Hudson admin (thanks!) I'm getting notifications about the availability of the Hudson server. Typically I get a DOWN notification that's shortly after followed by an UP notification. These come in quite often. Is someone/something actively rebooting Hudson or is there something else wrong with Hudson or the monitoring system? BR, Jukka Zitting
Re: Hudson needs a restart....
Since I had to restart Hudson, I upgraded Hudson from 1.323 to 1.327. Change log is here: https://hudson.dev.java.net/changelog.html Cheers, Nige On Oct 2, 2009, at 1:46 PM, Daniel Kulp wrote: This build had been going on for days: http://hudson.zones.apache.org/hudson/view/CXF/job/CXF-2.1.x-JDK15/241/ but not really as it's not in an executor. However, it's blocking subsequent builds from starting. I tried stopping it, but that's not working either. Can Hudson be restarted to fix that? Thanks! -- Daniel Kulp dk...@apache.org http://www.dankulp.com/blog
Re: Ceres/Buildbot down
Nigel, are you able to investigate your end as to why Ceres (a Yahoo machine) is having problems? Sure, I'll ping Rajiv and follow up on the email you sent him. Nige
Re: Hudson on Windows
If someone wants to supply a Windows Hudson slave and administer it, I'm fine with hooking it into the Hudson master (other Hudson admins can voice their opinion). Sounds like this is to be a buildbot slave -- not sure it's a good idea for a build slave to have 2 masters. Nige Hudson Admin On Oct 5, 2009, at 4:26 AM, Brett Porter wrote: On 05/10/2009, at 6:23 PM, Niklas Gustavsson wrote: Hi, there now seems to be a Windows box up and running (https://issues.apache.org/jira/browse/INFRA-1758). The (revised) intent was to run a distributed slave to vmbuild for Windows test runs. So, Hudson admins and users, would you be okay with having this set up? Are there additional interest in running builds on Windows? I suggested posting here also so that Gav could suggest whether that is the right place to run Hudson slaves or if that made sense on a different location. - Brett
Re: [hudson] Port Allocator Plugin?
Looks like the Port Allocator Plugin will only help if everyone uses it (unlikely) or your port conflicts are *within* your same job. Nige On Oct 12, 2009, at 2:29 AM, Bertrand Delacretaz wrote: Hi, Sling builds on hudson.zones.apache.org need a free HTTP port, is our default and recently started clashing with other builds. I've moved to another port (9362) for now, but it looks like the Port Allocator Plugin [1] could help here. Is anyone opposed to installing that plugin? I don't know Hudson well, no idea if that can have side effects. IIUC the plugin can be installed directly from http://hudson.zones.apache.org/hudson/pluginManager/available, it is listed there. -Bertrand [1] http://hudson.zones.apache.org/hudson/pluginManager/available
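
Note on port clashes: since a shared allocator only helps when every job uses it, one per-job alternative is to ask the operating system for an unused port when the build starts, instead of hard-coding one. This is a minimal sketch in Python, offered as an assumption about what a build script could do. It is not what the Port Allocator Plugin does, and `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))              # port 0 lets the OS pick a free port
        return sock.getsockname()[1]      # the port number the OS chose


if __name__ == "__main__":
    port = find_free_port()
    print(f"start the test server on port {port}")
```

The socket is closed before the build uses the port, so another process could still grab it in between; the sketch narrows the window for clashes rather than removing it.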
Hudson user security settings
Hudson Admins, I un-checked all the security checkboxes for each Hudson user except the "Administer" box. That's all users need. Going forward, just check the "Administer" box for new users. Nige
Re: Hudson machine utilization
Tim, the Hadoop labeled machines were not donated to ASF. Minerva, Vesta, and a couple others (used now for buildbot) were donated to ASF. I agree we should encourage folks to tie their linux builds to the "Ubuntu" label (which already exists), so both minerva and vesta get used. We should also encourage projects (spam-assasin, ftpserver, struts, vysper, xwork2) to move off of the Master hudson.zones.apache.org Nige On Oct 28, 2009, at 8:47 AM, Tim Ellison wrote: On 28/Oct/2009 15:13, Justin Mason wrote: Well, we could move more load from hudson.zones to minerva first: http://hudson.zones.apache.org/hudson/computer/%28master%29/load-statistics http://hudson.zones.apache.org/hudson/computer/minerva.apache.org%20%28Ubuntu%29/load-statistics (wow, those are good graphs!) Why do you say to do that first? At least there are times when Minerva is using both its executors. However, it looks like we could get by with half the current number of the Hadoop labeled machines without impacting anything. http://hudson.zones.apache.org/hudson/label/Hadoop/load-statistics?type=hour We certainly should embark on a program of persuading projects to schedule their jobs on both Linux and Solaris, though, to do that Maybe we can just define a useful set of labels to sets of nodes and encourage people to tie builds to them rather than specific machines. Regards, Tim On Wed, Oct 28, 2009 at 14:48, Tim Ellison wrote: Just looking at the Hudson machine utilization at the moment. There are a number of jobs that are tied to particular machines in the queue, and a number of (hadoop-labeled) machines that are committed to tied jobs only. I realize that the machines are courteously donated etc, but is the capacity being used effectively [1]? In particular, would the Hadoop jobs be impacted if we reclassified an existing slave as general usage, and more jobs as schedulable anywhere? [1] e.g.
http://hudson.zones.apache.org/hudson/computer/hadoop1%20%28Ubuntu%29/load-statistics?type=hour Regards, Tim
Re: Hudson machine utilization
On Nov 5, 2009, at 2:10 PM, Tim Ellison wrote: On 05/Nov/2009 12:48, Niklas Gustavsson wrote: On Thu, Nov 5, 2009 at 12:18 AM, Nigel Daley wrote: We should also encourage projects (spam-assasin, ftpserver, struts, vysper, xwork2) to move off of the Master hudson.zones.apache.org As for FtpServer, we want our builds on Solaris (in addition to Linux on which we also build). Would it be beneficial to provide a Hudson slave on a separate Solaris zone from where master is running? Yes, I think it would be preferable. Hudson is running on lucene.zones.apache.org but I suggest we ask infra for a dedicated Hudson zone rather than encourage individual projects to set up executors. WDYT? +1!
Re: Java updated on Minerva
Giri, can you give Tim access to Vesta?

Cheers, Nige

On Nov 5, 2009, at 2:05 PM, Tim Ellison wrote:

FYI I have updated the installed Javas available for builds on Minerva as follows:

harmony-1.5-32 -> Apache Harmony M11 32-bit
harmony-1.5-64 -> Apache Harmony M11 64-bit
ibm-1.4-32 -> IBM Java SDK 1.4 SR13 FP2 32-bit
ibm-1.4-64 -> IBM Java SDK 1.4 SR13 FP2 64-bit
ibm 1.5-32 -> IBM Java SDK 1.5 SR10 32-bit
ibm 1.5-64 -> IBM Java SDK 1.5 SR10 64-bit
ibm 1.6-32 -> IBM Java SDK 1.6 SR6 32-bit
ibm 1.6-64 -> IBM Java SDK 1.6 SR6 64-bit
latest -> Sun JDK 1.6.0 u17-b04 32-bit
latest1.4 -> Sun JDK 1.4.2 u19-b04 32-bit
latest1.5 -> Sun JDK 1.5.0 u22-b03 32-bit
latest1.5-32 -> Sun JDK 1.5.0 u22-b03 32-bit
latest1.5-64 -> Sun JDK 1.5.0 u22-b03 64-bit
latest1.6 -> Sun JDK 1.6.0 u17-b04 32-bit
latest1.6-32 -> Sun JDK 1.6.0 u17-b04 32-bit
latest1.6-64 -> Sun JDK 1.6.0 u17-b04 64-bit

Any problems just shout.

Nige: I don't have access to the other Linux machines, can you copy them across?

regards, Tim
Re: Hudson machine utilization
I agree we should encourage folks to tie their Linux builds to the "Ubuntu" label (which already exists), so both minerva and vesta get used. We should also encourage projects (spam-assassin, ftpserver, struts, vysper, xwork2) to move off of the Master hudson.zones.apache.org Why are minerva and vesta configured as "Leave this machine for tied jobs only"? I'd expect that setting for Master and Hadoop nodes, and let the others pick up any job. That would be preferable, but for legacy reasons Vesta and Minerva are left for tied jobs. This was because the Master was the only build node for 1.5+ years and had lots and lots of builds on it when we then added Vesta and Minerva. For compatibility reasons, we set it up as is. Suggestions on how to change this now? How to migrate builds off Master? Clearly the extremes are "rip the band-aid off -- builds start failing that try to run on Master" & "big project to contact build owners and push them to migrate". Nige
Re: Hudson machine utilization
On Nov 5, 2009, at 2:13 PM, Nigel Daley wrote: On Nov 5, 2009, at 2:10 PM, Tim Ellison wrote: On 05/Nov/2009 12:48, Niklas Gustavsson wrote: On Thu, Nov 5, 2009 at 12:18 AM, Nigel Daley wrote: We should also encourage projects (spam-assassin, ftpserver, struts, vysper, xwork2) to move off of the Master hudson.zones.apache.org As for FtpServer, we want our builds on Solaris (in addition to Linux on which we also build). Would it be beneficial to provide a Hudson slave on a separate Solaris zone from where master is running? Yes, I think it would be preferable. Hudson is running on lucene.zones.apache.org but I suggest we ask infra for a dedicated Hudson zone rather than encourage individual projects to set up executors. WDYT? +1! Gavin, do you know how we request a new Solaris build slave for Hudson? Thx, Nige
Re: Hudson machine utilization
On Nov 16, 2009, at 1:59 AM, "Tim Ellison" wrote: On 14/Nov/2009 04:46, Nigel Daley wrote: I agree we should encourage folks to tie their Linux builds to the "Ubuntu" label (which already exists), so both minerva and vesta get used. We should also encourage projects (spam-assassin, ftpserver, struts, vysper, xwork2) to move off of the Master hudson.zones.apache.org Why are minerva and vesta configured as "Leave this machine for tied jobs only"? I'd expect that setting for Master and Hadoop nodes, and let the others pick up any job. That would be preferable, but for legacy reasons Vesta and Minerva are left for tied jobs. This was because the Master was the only build node for 1.5+ years and had lots and lots of builds on it when we then added Vesta and Minerva. For compatibility reasons, we set it up as is. Suggestions on how to change this now? How to migrate builds off Master? Clearly the extremes are "rip the band-aid off -- builds start failing that try to run on Master" & "big project to contact build owners and push them to migrate". Just tie jobs to master that have dependencies there, How do we determine this for the 100+ jobs? Nigel and mark it for tied jobs only, and let other jobs target labels if they have specific OS/CPU requirements. I don't think anything is particularly 'broken' at the moment is it? I was just trying to understand the current set-up, and if we ask new jobs to set up a bit differently we can prevent overburdening master while leaving spare capacity elsewhere. Regards, Tim
Re: Hudson machine utilization
On Nov 16, 2009, at 3:52 PM, Tim Ellison wrote: On 16/Nov/2009 09:53, Jukka Zitting wrote: On Mon, Nov 16, 2009 at 1:12 AM, Justin Mason wrote: On Mon, Nov 16, 2009 at 00:01, Nigel Daley wrote: How do we determine this for the 100+ jobs? I'm assuming we can ask -- all Hudson users are supposed to be subbed to infrastructure@ at least. Also we can change the main site banner Do we have an easy way to get a list of all the jobs running on (vs. being explicitly bound to [1]) master? I volunteer to contact at least some of those projects and to help them migrate their builds. [1] http://hudson.zones.apache.org/hudson/computer/(master)/ Not that I'm aware of, other than piecemeal by watching what is running there via [2]. Hopefully there's enough info in groups of build names to get a few projects at a time notified. [2] http://hudson.zones.apache.org/hudson/computer/%28master%29/builds Regards, Tim I think anything currently *unbound* gets run on the master since it's the only 'slave' that isn't reserved for tied jobs (last I looked). Nige
Re: Hudson: 2 executors in lucene (solaris10)
No, please don't do this. It's for lucene subprojects. We don't (yet) have permission from Lucene PMC to run other builds on their zone. n. On Dec 2, 2009, at 10:01 AM, Bhuvaneswaran A wrote: On Wed, Dec 2, 2009 at 1:47 PM, Bhuvaneswaran A wrote: Right now, the solaris slave lucene.zones.apache.org is configured to execute 1 job at a time. This being the only Solaris node integrated with Hudson, may I set it to 2, so that 2 Hudson jobs can run in parallel? Unless anyone objects, I'll set it to "2" jobs for this solaris slave, lucene.zones.apache.org, on Thu, Dec 02 2009 @ 11am IST. -- Regards, Bhuvaneswaran A www.livecipher.com GPG: 0x7A13E5B0
Re: Hudson nodes + labels
+1. On Dec 4, 2009, at 12:41 AM, Tim Ellison wrote: Here's a proposal for tweaking the Hudson nodes usage. It's not much of a change and hopefully reflects what is happening already, project specific resources run tied project jobs, and general purpose nodes are labeled with OS identifiers for those that care. Comments welcome.
Label: Master : tied jobs only
Label: Lucene
  Lucene : tied jobs only (for Lucene project)
Label: Solaris10 : any job
Label: Ubuntu
  Minerva : any job
  Vesta : any job
Label: Hadoop
  hadoop1-8 : tied jobs only (for Hadoop project)
Label: Win2008 : any job
Jobs should be encouraged to specify node requirements as generally as possible, i.e. 'any node' before a node type (via label) before a specific node (via name). We then try to reduce load on Master by generalizing jobs away from the master node unless they need to run there, e.g. for config purposes. AIUI is requested by INFRA-2360 is awaiting new disks for nyx, Gavin is looking at alternatives now. Regards, Tim
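The proposal above can be read as a small lookup rule: a job names nothing, a label, or a specific node, and "tied jobs only" nodes are skipped unless the job asks for them. The following is a toy model of that rule, not Hudson's actual scheduler. The `Node` class and `eligible_nodes` function are hypothetical names, only one of the hadoop1-8 machines is listed, and the Solaris10 and Win2008 entries are left out because the proposal does not name their nodes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    labels: frozenset
    tied_only: bool  # True models "Leave this machine for tied jobs only"


# Nodes taken from the proposal; names are lowercased for illustration.
NODES = [
    Node("master", frozenset(), tied_only=True),
    Node("lucene", frozenset({"Lucene"}), tied_only=True),
    Node("minerva", frozenset({"Ubuntu"}), tied_only=False),
    Node("vesta", frozenset({"Ubuntu"}), tied_only=False),
    Node("hadoop1", frozenset({"Hadoop"}), tied_only=True),
]


def eligible_nodes(requirement=None, nodes=NODES):
    """Return node names a job may run on.

    requirement is None ("any node"), a label, or a node name.
    """
    if requirement is None:
        # Untied jobs may only land on nodes that accept any job.
        return [n.name for n in nodes if not n.tied_only]
    # A tied job matches either an exact node name or a label.
    return [n.name for n in nodes
            if requirement == n.name or requirement in n.labels]
```

In this model an untied job and a job tied to the "Ubuntu" label both spread across minerva and vesta, while a job tied to "master" runs only there, which is the outcome the proposal is aiming for.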
Re: Master build queue
please leave slot administration to Hudson admins. we're trying to remove slots, not add them, on the Master. nige On Dec 10, 2009, at 12:48 AM, Aristedes Maniatis wrote: I hope no one minds too much... I just increased the number of slots in the Hudson master from 2 to 3. What happened is that I've got two jobs which are both set as 'multi-configuration' jobs. Each job actually spawns a child job in order to do the work, but both the parent and the child are in the executor queue at the same time. The parent does no work, but ties up a slot. So my two jobs will sometimes get triggered at the same time. Then both parents sit in the queue and neither can spawn a child to progress any further. Adding one more slot solved this... until a third multi-config job happens to hit the queue at exactly the same time. Ideally, I'll look at ways to avoid my job having to be tied to one node. Ari Maniatis -- --> Aristedes Maniatis GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
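The executor starvation Ari describes can be checked with a few lines of arithmetic. This is a simplified sketch under the assumptions stated in his message: every parent job grabs an executor as soon as it is triggered, holds it until its child finishes, and the child needs a free executor of its own. The function name `finished_parents` is made up for illustration and nothing here reflects Hudson internals.

```python
def finished_parents(executors, parents):
    """Count parent jobs that can finish when all are triggered at once.

    Each parent holds one executor and does no work; its child needs
    a separate free executor to run.
    """
    holding = min(parents, executors)  # parents that got a slot
    free = executors - holding         # slots left over for children
    finished = 0
    while holding > 0 and free > 0:
        done = min(free, holding)      # children that can run this round
        holding -= done                # their parents release their slots
        free += done
        finished += done
    return finished
```

With 2 executors and 2 simultaneous parents nothing finishes, with 3 executors both finish, and with 3 executors and 3 parents the queue is stuck again, which matches the "until a third multi-config job happens to hit the queue" caveat.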
Re: Hudson Hadoop "Common" project - rename?
makes sense. i'll do this (as a Hadoop PMC member). n. On Jan 5, 2010, at 6:42 AM, sebb wrote: There are two Hudson projects for Hadoop: "Hadoop" and "Common". The latter can easily be confused with Apache Commons (though that does not yet use Hudson). Could the "Common" jobs be folded into "Hadoop"? If not, please could "Common" be renamed as "Hadoop-Common" or similar? Thanks.
Stepping down as Hudson Admin
Justin, Tim, Giri, and Jukka (Hudson Admins), I've had some changes in my personal and work life that require me to step back from some of my extra responsibilities -- unfortunately this is one of them. After 3 years, I'm stepping down as Hudson Admin for Apache and signing off these lists. Contact me directly if you need something. FWIW, Giri works with me so I'll never be that far away. Giri can serve as point-of-contact if there are any issues with the Y! donated/hosted machines (minerva, vesta, etc). It's been a fun 3 years administering the Lucene-become-Apache Hudson instance. Many thanks to Justin for joining me early on in this and the rest of you for carrying the butler forward in providing this important service! Cheers, Nige
Re: Stepping down as Hudson Admin
Guys, I've got time again to help out. Let me know if you'd like another Hudson admin. Looks like some great changes (new host, better auth) since I was here! Cheers, Nige On Jan 11, 2010, at 11:54 AM, Nigel Daley wrote: > Justin, Tim, Giri, and Jukka (Hudson Admins), > > I've had some changes in my personal and work life that require me to step > back from some of my extra responsibilities -- unfortunately this is one of > them. After 3 years, I'm stepping down as Hudson Admin for Apache and > signing off these lists. Contact me directly if you need something. FWIW, > Giri works with me so I'll never be that far away. Giri can serve as > point-of-contact if there are any issues with the Y! donated/hosted machines > (minerva, vesta, etc). > > It's been a fun 3 years administering the Lucene-become-Apache Hudson > instance. Many thanks to Justin for joining me early on in this and the rest > of you for carrying the butler forward in providing this important service! > > Cheers, > Nige >
Re: Stepping down as Hudson Admin
Thanks Niklas. Looks like I need admin role in Hudson and login access to [hudson, minerva, vesta].apache.org. Want me to file a ticket? Thanks, Nige On Dec 21, 2010, at 11:59 PM, Niklas Gustavsson wrote: > On Wed, Dec 22, 2010 at 12:44 AM, Nigel Daley wrote: >> Guys, I've got time again to help out. Let me know if you'd like another >> Hudson admin. > > You're more than welcome back :-) Since some hosts have been created > or reset, your accounts may have been lost so let us know if we should > set you up somewhere. > > /niklas
Re: PreCommit-HDFS-Build
Hudson does a terrible job of killing underlying processes when a build is aborted due to someone killing it from UI or it hitting a timeout. For these hadoop builds, it usually means that 3 or 4 processes are left lying around that can and do interfere with subsequent jobs. It's not clear to me why they are hanging, but I suspect NFS issues on these hadoop slaves. We're going to disable NFS on a couple of them later this week and see if that helps. I try to monitor for this situation regularly and properly kill builds that seem hung. Since these are on the hadoop slaves, it doesn't impact other project builds. Cheers, Nige On Jan 17, 2011, at 7:20 AM, Niklas Gustavsson wrote: > Hi > > The following build keeps getting locked up in Hudson and requires > frequent killing. Could someone have a look at it or should we disable > it for now? > > https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/ > > /niklas
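For context on the leftover-process problem, one generic POSIX technique is to start a build step in its own process group and signal the whole group on timeout, so that children spawned by the step are not orphaned. The sketch below illustrates that technique only; it is not what Hudson does, and `run_with_timeout` is a hypothetical helper. It also would not help with a process stuck in uninterruptible I/O (for example on a hung NFS mount), which cannot be killed until the I/O returns.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout):
    """Run a shell command in its own session; kill its whole group on timeout.

    Returns the exit code, or None if the command was killed.
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group id equals its pid.
    proc = subprocess.Popen(cmd, shell=True, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Signal every process in the group, including background children.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the shell so it does not linger as a zombie
        return None
```

A step such as `run_with_timeout("sleep 30 & sleep 30", 0.5)` is killed together with its background child, whereas killing only the shell would leave the background `sleep` running.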
Re: Questions : install Jenkins instead of Hudson
+1 for Jenkins. +1 for a redirect. Nige On Feb 3, 2011, at 7:23 AM, Matthias Wessendorf wrote: > On Thu, Feb 3, 2011 at 2:39 PM, Niklas Gustavsson > wrote: >> On Thu, Feb 3, 2011 at 2:28 PM, Olivier Lamy wrote: >>> * which code branch do we have to follow ( Jenkins or Oracle : IMHO >>> the most active is the Jenkins one) >> >> Time will tell, but I'm betting on the Jenkins branch. Jenkins has >> released a 1.396 release which contains a fix for JNLP slaves not >> reconnecting correctly. This is a problem we currently have with our >> Windows slave and that I would like to see fixed. So, unless anyone objects, I >> will look into upgrading to Jenkins 1.396 in the near future. > > +1 > >> >>> * do we have to rename hudson.apache.org to jenkins.apache.org >> >> In my opinion, this has low priority. The major reason for renaming >> would probably be to make a stand in the Hudson vs Jenkins situation. > > :-) why not a redirect from hudson.a.o to jenkins.a.o ? > > -Matthias > >> >> /niklas >> > > > > -- > Matthias Wessendorf > > blog: http://matthiaswessendorf.wordpress.com/ > sessions: http://www.slideshare.net/mwessendorf > twitter: http://twitter.com/mwessendorf
502 from asf hudson
Having trouble contacting https://hudson.apache.org/hudson Getting 502: Error reading from remote server On the machine, the avahi-daemon is 100% cpu. Should this be killed?
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20934 avahi 20 0 34052 1600 1284 R 100 0.0 887:30.42 avahi-daemon
Cheers, Nige
PreCommit-MAPREDUCE-Build
Just a heads up that you'll see many instances of PreCommit-MAPREDUCE-Build in the Hudson queue. This is expected as I've just turned on precommit testing for MAPREDUCE. The Hadoop build slaves should work thru most of these in the next 24 hours. They will only be run on these slaves. Thx, Nige
time to clean out hudson tomcat logs?
From hudson.apache.org:/home/hudson/tools/tomcat/latest/logs
$ ls -lhS
total 1.3T
-rw-rw-r-- 1 root hudson 657G Feb 26 05:40 catalina.out
-rw-r--r-- 1 hudson hudson 65G Jan 4 23:58 catalina.2011-01-04.log
-rw-r--r-- 1 hudson hudson 46G Feb 1 23:59 catalina.2011-02-01.log
-rw-r--r-- 1 hudson hudson 45G Jan 3 23:59 catalina.2011-01-03.log
-rw-r--r-- 1 hudson hudson 30G Jan 6 23:59 catalina.2011-01-06.log
-rw-r--r-- 1 hudson hudson 28G Nov 24 23:59 catalina.2010-11-24.log
-rw-r--r-- 1 hudson hudson 28G Nov 10 23:58 catalina.2010-11-10.log
-rw-r--r-- 1 hudson hudson 28G Feb 2 23:59 catalina.2011-02-02.log
-rw-r--r-- 1 hudson hudson 22G Feb 9 23:59 catalina.2011-02-09.log
-rw-r--r-- 1 hudson hudson 20G Feb 11 23:59 catalina.2011-02-11.log
-rw-r--r-- 1 hudson hudson 17G Nov 25 23:58 catalina.2010-11-25.log
-rw-r--r-- 1 hudson hudson 12G Jan 7 23:55 catalina.2011-01-07.log
-rw-r--r-- 1 hudson hudson 12G Oct 27 23:58 catalina.2010-10-27.log
-rw-r--r-- 1 hudson hudson 11G Nov 12 23:59 catalina.2010-11-12.log
-rw-r--r-- 1 hudson hudson 11G Oct 24 23:59 catalina.2010-10-24.log
-rw-r--r-- 1 hudson hudson 10G Nov 19 23:57 catalina.2010-11-19.log
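If the rotated logs are to be pruned, a first step is to pick out the dated files older than some cutoff and see how much space each one holds. The sketch below does only that selection and deletes nothing. It assumes the `catalina.YYYY-MM-DD.log` naming visible in the listing above; the function name `rotated_logs_before` and the choice of cutoff are illustrative. The live `catalina.out` does not match the pattern and is deliberately ignored, since it would need truncation or rotation rather than removal.

```python
import re
from datetime import date
from pathlib import Path

# Matches rotated files such as catalina.2011-01-04.log, not catalina.out.
ROTATED = re.compile(r"^catalina\.(\d{4})-(\d{2})-(\d{2})\.log$")


def rotated_logs_before(log_dir, cutoff):
    """Return (path, size_in_bytes) for rotated logs dated before cutoff.

    Results are sorted largest first, like `ls -lS`.
    """
    found = []
    for path in Path(log_dir).iterdir():
        match = ROTATED.match(path.name)
        if match and date(*map(int, match.groups())) < cutoff:
            found.append((path, path.stat().st_size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling it with the logs directory and, say, `date(2011, 1, 1)` would list only the 2010 files, which an admin could then review before compressing or removing them.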
Re: Hadoop patch builds for other Projects
Hey Grant. Sorry for the late reply. I revamped the precommit testing in the fall so that it doesn't use Jira email anymore to trigger a build. The process is controlled by https://builds.apache.org/hudson/job/PreCommit-Admin/ which has some documentation up at the top of the job. You can look at the config of the job (do you have access?) to see what it's doing. Any project could use this same admin job -- you just need to ask me to add the project to the Jira filter used by the admin job (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 ) once you have the downstream job(s) setup for your specific project. For Hadoop we have 3 downstream builds configured which also have some documentation: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ Let me know if you have questions or can't see these job configs. Cheers, Nige On Mar 30, 2011, at 8:37 AM, Grant Ingersoll wrote: > Over in Lucene, we are interested in setting up a patch testing framework for > Lucene similar to what Hadoop does. That is, when a new patch comes in, we > would like to apply it to the trunk, test it and check whether it meets our > requirements and then post a comment on the JIRA issue giving it a > preliminary vote. > > Does anyone know what the process is for setting this up? Is there a wiki or > other instructions for it anywhere? Or does, perhaps, Jenkins have a plugin > that supports this kind of thing? As I recall from talking w/ Nigel about > this before, it involves a fair amount of scripting and some mail processing > work. > > Thanks, > Grant
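To make the admin-job idea concrete, here is a rough sketch of the fan-out step: read issue keys from a Jira filter's XML and map each one to a per-project downstream job. This is not the actual PreCommit-Admin script, whose logic is not shown in this thread. The shape of `SAMPLE` (an RSS-style feed with a `key` element per item) and the issue keys in it are assumptions made up for illustration; only the three job names come from the message above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the Jira filter's XML output.
SAMPLE = """<rss version="0.92"><channel>
  <item><key>HADOOP-1</key></item>
  <item><key>HDFS-2</key></item>
  <item><key>LUCENE-3</key></item>
</channel></rss>"""

# Downstream jobs named in the thread, keyed by Jira project prefix.
DOWNSTREAM = {
    "HADOOP": "PreCommit-HADOOP-Build",
    "MAPREDUCE": "PreCommit-MAPREDUCE-Build",
    "HDFS": "PreCommit-HDFS-Build",
}


def jobs_to_trigger(xml_text):
    """Return (issue_key, downstream_job) pairs for known projects."""
    pairs = []
    for key in ET.fromstring(xml_text).iter("key"):
        issue = key.text.strip()
        project = issue.split("-", 1)[0]
        job = DOWNSTREAM.get(project)
        if job is not None:  # projects without a downstream job are skipped
            pairs.append((issue, job))
    return pairs
```

Under this sketch, onboarding a new project would mean adding it to the filter and adding one entry to the mapping once its downstream job exists, which mirrors the two steps Nigel lists.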
Re: Hadoop patch builds for other Projects
It was just temporarily disabled while we worked out some changes. It's back on now. Cheers, Nige On May 16, 2011, at 2:55 PM, Grant Ingersoll wrote: > Nigel, > > I'm finally coming back to this and starting to investigate. I see the main > Job is disabled at the moment. Is there something else that is used? > > -Grant > > On Apr 10, 2011, at 11:20 AM, Nigel Daley wrote: > >> Hey Grant. Sorry for the late reply. >> >> I revamped the precommit testing in the fall so that it doesn't use Jira >> email anymore to trigger a build. The process is controlled by >> https://builds.apache.org/hudson/job/PreCommit-Admin/ >> which has some documentation up at the top of the job. You can look at the >> config of the job (do you have access?) to see what it's doing. Any project >> could use this same admin job -- you just need to ask me to add the project >> to the Jira filter used by the admin job >> (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 >> ) once you have the downstream job(s) setup for your specific project. For >> Hadoop we have 3 downstream builds configured which also have some >> documentation: >> https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ >> https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ >> https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ >> >> Let me know if you have questions or can't see these job configs. >> >> Cheers, >> Nige >> >> On Mar 30, 2011, at 8:37 AM, Grant Ingersoll wrote: >> >>> Over in Lucene, we interested in setting up a patch testing framework for >>> Lucene similar to what Hadoop does. That is, when a new patch comes in, we >>> would like to apply it to the trunk, test it and check it if it meets our >>> requirements and then post a comment on the JIRA issue giving it a >>> preliminary vote. >>> >>> Does anyone know what the process is for setting this up? Is there a wiki >>> or other instructions for it anywhere? 
Or does, perhaps, Jenkins have a >>> plugin that supports this kind of thing? As I recall from talking w/ Nigel >>> about this before, it involves a fair amount of scripting and some mail >>> processing work. >>> >>> Thanks, >>> Grant >> > >
Re: weird error on jenkins "cannot assign instance of hudson.model.StreamBuildListener"
Restarted hadoop9 slave. Problem seems to be fixed. Let me know. n. On May 25, 2011, at 10:11 PM, Patrick Hunt wrote: > This is happening each time I trigger a build, any help? thanks. > > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/PreCommit-ZOOKEEPER-Build/288/console > > Patrick