New yahoo.net Hudson slaves are now hooked up and related Hadoop builds have been moved to these slaves. This frees up vesta for any other builds. I'll follow up with another email on moving builds to vesta and off the master.

Cheers,
Nige

On Jul 17, 2009, at 12:17 PM, Nigel Daley wrote:

FWIW, I'm still working on getting the yahoo.net machines properly imaged. Hoping to have them ready when I get back from vacation the week of July 27.

Nige

On Jul 17, 2009, at 9:15 AM, Justin Mason wrote:

On Thu, Jul 2, 2009 at 18:36, Nigel Daley<ni...@apache.org> wrote:
Folks,

I'd really like to move builds off the Hudson master. Here's a proposal:

1) We move the Hadoop related builds (Common, HDFS, Mapreduce, Pig,
ZooKeeper, Hive, HBase, Chukwa, Avro) off to some other machines (see 4
below)

2) That would free up minerva and vesta as Ubuntu build slaves for all the
other projects (which should be more than enough capacity).

3) We get permission to use the current lucene.zones slave as a Solaris build
slave for those projects that really want a Solaris build (how many is
that, I wonder?)

4) We add a bunch more Ubuntu slaves to hudson.zones out of a pool of publicly IP'd yahoo.net machines my employer has for Hadoop related builds.

So -- what's the situation with this proposal?

I'm all in favour. I've been monitoring Hudson closely for the past 2
weeks, and it's clear that it's over-capacity. Even with the limiting
band-aids I've been putting in place to control overlong builds, right now, the build queue has 8 pending builds waiting for a free executor,
and that's been pretty much the normal situation.  It needs more
machines.

Paul, are you still -1?

--j.


Cheers,
Nige


On Jun 30, 2009, at 6:17 AM, Justin Mason wrote:

On Tue, Jun 30, 2009 at 13:46, sebb<seb...@gmail.com> wrote:

On 30/06/2009, Jukka Zitting <jukka.zitt...@gmail.com> wrote:

Hi,

Another Tuscany-2x build [1] was stuck with lots of OOM errors and other failures in the console log. I killed the build as it had already been running for almost 7 hours, far longer than the 40 minutes taken by
the last successful build.

[1] http://hudson.zones.apache.org/hudson/job/Tuscany-2x/116/

It looked to me as though the build was stalled, i.e. Hudson was not able to detect/recover from the situation. Is this a known problem?

Is there any way to give the builds a bit more memory?

It looks like Tuscany has not built successfully for a long while, so
this is likely to keep happening.

It's a pity that the console output does not have time-stamps, or it
would be a lot easier to tell that nothing was happening.

It could be the entire machine was under memory pressure, given those
OOM errors.  I wonder if that caused the Hudson master to get
confused.

--j.





--
--j.

