Re: Hudson restarted

Jukka Zitting Fri, 19 Feb 2010 15:34:09 -0800

Hi,

I had to restart Hudson again as it stopped responding to HTTP
requests. I'm not sure if it was the same OOM issue we saw earlier
today. This time the log shows only simultaneous EOFExceptions from
vesta and the hadoop slaves 4, 6 and 8, and then nothing for an hour
before I forcibly restarted the server.

On Fri, Feb 19, 2010 at 7:04 PM, Justin Mason <j...@jmason.org> wrote:
> I think we could limit it by a certain number of days.  However, just
> taking a look now, and it appears most projects have sane limits -- it
> could be the sheer number of projects that causes trouble.

Some projects do have quite a few past builds around. Here's a quick
top-ten list, obtained by counting the builds/*/build.xml files for
each project:

    745 Cactus
    674 SpamAssassin-trunk
    492 Cayenne-trunk
    358 Chemistry-site
    340 struts2
    256 Empire-DB snapshot
    178 HttpComponents Client
    172 hupa-trunk
    168 xwork2
    146 Hadoop-Patch-h4.grid.sp2.yahoo.net
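
A count like the one above could be reproduced roughly as follows. This
is only a minimal sketch, not the command actually used: it assumes each
project is a directory under the Hudson jobs directory and keeps one
builds/*/build.xml per retained build, and count_builds is a
hypothetical helper name.

```python
from pathlib import Path


def count_builds(jobs_dir, top=10):
    """Return up to `top` (count, project) pairs, largest count first."""
    counts = []
    for project in Path(jobs_dir).iterdir():
        if not project.is_dir():
            continue
        # Assumed layout: <project>/builds/<build>/build.xml, one per kept build.
        n = sum(1 for _ in project.glob("builds/*/build.xml"))
        counts.append((n, project.name))
    # Sort by count descending, then by project name for a stable order.
    counts.sort(key=lambda pair: (-pair[0], pair[1]))
    return counts[:top]
```

Calling count_builds() on the jobs directory and printing each pair as
f"{n:7d} {name}" would give output in the same shape as the list above.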

Only SpamAssassin-trunk and Hadoop-Patch-h4.grid.sp2.yahoo.net have
discard limits and even they set it pretty high (45 and 21 days). Is
there any reason to have such a long discard limit? You can always
explicitly tag builds that you want to keep around to be referenced in
issue trackers, etc. A few days (say 7 or 10 to be certain) should be
quite enough to determine whether you need to keep a build around for
longer or not.

> disk i/o on the master is pretty slow, anyway, it seems.

Yep, I've noticed that too. Is there anything we can do about that?

BR,

Jukka Zitting
