New yahoo.net Hudson slaves are now hooked up and related Hadoop
builds have been moved to these slaves. This frees up vesta for any
other builds. I'll follow up with another email on moving builds to
vesta and off the master.
Cheers,
Nige
On Jul 17, 2009, at 12:17 PM, Nigel Daley wrote:
FWIW, I'm still working on getting the yahoo.net machines properly
imaged. Hoping to have them ready when I get back from vacation the
week of July 27.
Nige
On Jul 17, 2009, at 9:15 AM, Justin Mason wrote:
On Thu, Jul 2, 2009 at 18:36, Nigel Daley<ni...@apache.org> wrote:
Folks,
I'd really like to move builds off the Hudson master. Here's a
proposal:
1) We move the Hadoop related builds (Common, HDFS, Mapreduce, Pig,
ZooKeeper, Hive, HBase, Chukwa, Avro) off to some other machines (see 4
below)
2) That would free up minerva and vesta as Ubuntu build slaves for all
the other projects (which should be more than enough capacity).
3) We get permission to use the current lucene.zones slave as a Solaris
build slave for those projects that really want a Solaris build (how
many is that I wonder?)
4) We add a bunch more Ubuntu slaves to hudson.zones out of a pool of
publicly IP'd yahoo.net machines my employer has for Hadoop related
builds.
So -- what's the situation with this proposal?
I'm all in favour. I've been monitoring Hudson closely for the past 2
weeks, and it's clear that it's over-capacity. Even with the limiting
band-aids I've been putting in place to control overlong builds, right
now, the build queue has 8 pending builds waiting for a free executor,
and that's been pretty much the normal situation. It needs more
machines.
Paul, are you still -1?
--j.
Cheers,
Nige
On Jun 30, 2009, at 6:17 AM, Justin Mason wrote:
On Tue, Jun 30, 2009 at 13:46, sebb<seb...@gmail.com> wrote:
On 30/06/2009, Jukka Zitting <jukka.zitt...@gmail.com> wrote:
Hi,
Another Tuscany-2x build [1] was stuck with lots of OOM errors and
other failures in the console log. I killed the build as it had already
been running for almost 7 hours, which is much more than the 40 minutes
used by the last successful build.
[1] http://hudson.zones.apache.org/hudson/job/Tuscany-2x/116/
It looked to me as though the build was stalled, i.e. Hudson was not
able to detect/recover from the situation. Is this a known problem?
Is there any way to give the builds a bit more memory?
It looks like Tuscany has not built successfully for a long while, so
this is likely to keep happening.
It's a pity that the console output does not have time-stamps, or it
would be a lot easier to tell that nothing was happening.
It could be the entire machine was under memory pressure, given those
OOM errors. I wonder if that caused the Hudson master to get confused.
--j.