The Cassandra dtest builds take ~12 hours. The unit tests take over an hour.
We are looking into parallelising these, but work hasn't started on that yet.

We recently parallelised a number of the unit test builds and added pipeline 
builds, and since then builds have been crashing with full disks. Yesterday 
it was cassandra12 and cassandra15. I wiped their Cassandra build workspaces 
and they are running again. I also reduced the log retention from 50 to 25 
builds (which I didn't want to do; we'd rather keep more if we knew we could).

But if the disk usage came from other projects, it would be great to know. 
Is there some way we can improve the visibility into disk usage on the build 
nodes? How full are they? And which projects are taking up space? Does Jenkins 
provide this info? Or could infra dump a `du …` report somewhere?
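
To make the request concrete, here is a rough sketch of the kind of report I 
mean. It is only an illustration, not something infra has or has agreed to: 
the function names are made up, and it assumes each project's workspace is a 
top-level directory under a single workspace root on the node.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping symlinks."""
    total = 0
    # os.walk does not follow symlinked directories by default.
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def workspace_report(workspace_root):
    """Return (fraction_of_disk_used, [(dir_name, bytes), ...]), largest first.

    workspace_root is a hypothetical path; the real layout on the build
    nodes may differ.
    """
    usage = shutil.disk_usage(workspace_root)
    sizes = []
    for entry in os.scandir(workspace_root):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size_bytes(entry.path)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return usage.used / usage.total, sizes
```

Calling `workspace_report()` on a node's workspace root and publishing the 
result somewhere readable would already tell us how full each node is and 
which projects account for the space.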

regards,
Mick


On Thu, 23 Jan 2020, at 07:26, Martin Stockhammer wrote:
> Hi,
> 
> our average build time for the main archiva build job is about 1 hour on 
> the apache build servers.
> We have a timeout of 2h configured in our pipeline.
> 
> So, one hour is too short for us and we would appreciate, if you 
> consider to increase your kill timeout to some higher value.
> 
> Regards
> 
> Martin
> 
> 
> On 23.01.20 01:55, Chris Lambertus wrote:
> > Folks,
> > 
> > Over the last week or so we have received many reports of broken builds due 
> > to nodes out of resources. As noted in INFRA-19751, builds appear to fail 
> > yet continue to run, using up all available resources on a build node.
> > 
> > I will be implementing a system to kill jenkins processes based on duration 
> > of run. My initial feeling is to kill any single process which has been 
> > running for longer than one hour real-time.
> > 
> > I will also be implementing a system to kill/purge all docker containers 
> > which have been running for over 6 hours.
> > 
> > 
> > I am seeking input on these time limits, especially from those with larger 
> > builds. Is there any reason a -single process- or a docker container should 
> > run for more than 1 or 6 hours respectively?
> > 
> > Thanks,
> > Chris
> > ASF Infra
> > 
> > 
>
