Hey Nicholas,

Yeah, so Jenkins has its own timeout mechanism, and it will just kill the entire build after 120 minutes. But since run-tests is still sitting in the middle of the tests at that point, it can't actually post a failure message.

I think run-tests-jenkins should just wrap the call to run-tests in its own timeout. It might be possible to just use this:
http://linux.die.net/man/1/timeout

- Patrick

On Fri, Aug 15, 2014 at 1:31 PM, Nicholas Chammas <[email protected]> wrote:

> OK, I've captured this in SPARK-3076
> <https://issues.apache.org/jira/browse/SPARK-3076>.
>
> Patrick,
>
> Is the problem that this run-tests
> <https://github.com/apache/spark/blob/0afe5cb65a195d2f14e8dfcefdbec5dac023651f/dev/run-tests-jenkins#L151>
> step times out, and that is currently not handled gracefully? To be more
> specific, it hangs for 120 minutes, times out, but the parent script for
> some reason is also terminated. Does that sound right?
>
> Nick
>
>
> On Fri, Aug 15, 2014 at 3:33 PM, Shivaram Venkataraman <[email protected]> wrote:
>
>> Jenkins runs for this PR https://github.com/apache/spark/pull/1960 timed
>> out without notification. The relevant Jenkins logs are at
>>
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull
>>
>>
>> On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas <[email protected]> wrote:
>>
>>> Shivaram,
>>>
>>> Can you point us to an example of that happening? The Jenkins console
>>> output, that is.
>>>
>>> Nick
>>>
>>>
>>> On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman <[email protected]> wrote:
>>>
>>>> Also I think Jenkins doesn't post build timeouts to github. Is there anyway
>>>> we can fix that ?
>>>> On Aug 15, 2014 9:04 AM, "Patrick Wendell" <[email protected]> wrote:
>>>>
>>>> > Hi All,
>>>> >
>>>> > I noticed that all PR tests run overnight had failed due to timeouts. The
>>>> > patch that updates the netty shuffle I believe somehow inflated to the
>>>> > build time significantly. That patch had been tested, but one change was
>>>> > made before it was merged that was not tested.
>>>> >
>>>> > I've reverted the patch for now to see if it brings the build times back
>>>> > down.
>>>> >
>>>> > - Patrick
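
For reference, here is a rough sketch of the timeout-wrapping idea from the top of this message. It assumes GNU coreutils timeout is available on the Jenkins workers; timeout exits with status 124 when it has to kill the command. The function name run_with_timeout, the TESTS_TIMEOUT variable, and the 100-minute default are hypothetical and not taken from the actual script, and the echo lines only stand in for whatever run-tests-jenkins would really post to the pull request.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual run-tests-jenkins code.

# Assumed limit: keep it below Jenkins' own 120-minute kill so that this
# script is still alive to report the result.
TESTS_TIMEOUT="${TESTS_TIMEOUT:-100m}"

# run_with_timeout LIMIT COMMAND [ARGS...]
# Runs COMMAND under GNU timeout, prints a one-line result, and returns
# the exit status of the wrapped command (124 on timeout).
run_with_timeout() {
  local limit="$1"
  shift

  local status=0
  timeout "$limit" "$@" || status=$?

  if [ "$status" -eq 124 ]; then
    # GNU timeout uses 124 to signal that the time limit was hit.
    echo "Tests timed out after ${limit}."
  elif [ "$status" -ne 0 ]; then
    echo "Tests failed with exit code ${status}."
  else
    echo "Tests passed."
  fi

  return "$status"
}

# Intended use inside the Jenkins wrapper script (assumed path):
#   run_with_timeout "$TESTS_TIMEOUT" ./dev/run-tests
```

The point of setting the inner limit lower than the Jenkins limit is that the wrapper script gets a chance to report the timeout before Jenkins kills the whole build.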
