
Over the last week or so we have received many reports of builds breaking
because build nodes ran out of resources. As noted in INFRA-19751, builds
appear to fail yet continue to run, using up all available resources on a
build node.

I will be implementing a system to kill Jenkins processes based on how long
they have been running. My initial feeling is to kill any single process which
has been running for longer than one hour of wall-clock time.
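
A rough sketch of what such a check could look like, for discussion only and
not the actual implementation. It assumes a Linux node with procps-ng `ps`
(which provides the `etimes` column) and that builds run under an account named
"jenkins"; both the account name and the function names are hypothetical.

```python
import os
import signal
import subprocess

MAX_AGE_SECONDS = 60 * 60  # the proposed one-hour limit
BUILD_USER = "jenkins"     # assumption: builds run as this account


def overdue_pids(ps_output, user=BUILD_USER, max_age=MAX_AGE_SECONDS):
    """Return PIDs owned by `user` whose elapsed time exceeds `max_age`.

    Expects one "pid user elapsed_seconds" triple per line.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, owner, elapsed = fields
        if owner == user and int(elapsed) > max_age:
            pids.append(int(pid))
    return pids


def kill_overdue(dry_run=True):
    """List overdue processes and, unless dry_run is set, send them SIGTERM."""
    # etimes is elapsed wall-clock time in seconds; the trailing "=" on each
    # column suppresses the header row. Separate -o options avoid ambiguity.
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "user=", "-o", "etimes="],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in overdue_pids(out):
        print(f"overdue process: {pid}")
        if not dry_run:
            os.kill(pid, signal.SIGTERM)
```

Defaulting to a dry run makes it possible to log what would be killed for a
while before enforcing the limit.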

I will also be implementing a system to kill and purge all Docker containers
which have been running for over 6 hours.
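
Again purely as a sketch under stated assumptions, not the real tooling: the
container side could be approximated with the standard `docker ps -q`,
`docker inspect` and `docker rm -f` commands. The function names are
hypothetical, and the timestamp handling assumes Docker reports
`State.StartedAt` in UTC with a trailing "Z".

```python
import subprocess
from datetime import datetime, timedelta, timezone

MAX_CONTAINER_AGE = timedelta(hours=6)  # the proposed six-hour limit


def parse_started_at(value):
    """Parse a timestamp such as 2020-02-01T10:00:00.123456789Z (UTC assumed)."""
    # Drop the fractional seconds and the trailing "Z" before parsing.
    seconds = value.split(".")[0].rstrip("Z")
    parsed = datetime.strptime(seconds, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_overdue(started_at, now, max_age=MAX_CONTAINER_AGE):
    """True if the container has been running for longer than max_age."""
    return now - parse_started_at(started_at) > max_age


def purge_overdue(dry_run=True):
    """List overdue running containers and, unless dry_run, force-remove them."""
    ids = subprocess.run(
        ["docker", "ps", "-q"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    now = datetime.now(timezone.utc)
    for cid in ids:
        started = subprocess.run(
            ["docker", "inspect", "--format", "{{.State.StartedAt}}", cid],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        if is_overdue(started, now):
            print(f"overdue container: {cid} (started {started})")
            if not dry_run:
                subprocess.run(["docker", "rm", "-f", cid], check=True)
```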

I am seeking input on these time limits, especially from those with larger 
builds. Is there any reason a -single process- or a Docker container should run
for more than 1 hour or 6 hours, respectively?

ASF Infra
