*Bam. <https://github.com/apache/spark/pull/1974#issuecomment-52368527>*
On Fri, Aug 15, 2014 at 5:04 PM, Patrick Wendell <[email protected]> wrote:

> Yeah I was thinking something like that. Basically we should just have a
> variable for the timeout and I can make sure it's under the configured
> Jenkins time.
>
>
> On Fri, Aug 15, 2014 at 1:55 PM, Nicholas Chammas <[email protected]> wrote:
>
>> So 2 hours is a hard cap on how long a build can run. Okie doke.
>>
>> Perhaps then I'll wrap the run-tests step as you suggest and limit it to
>> 100 minutes or something, and cleanly report if it times out.
>>
>> Sound good?
>>
>>
>> On Fri, Aug 15, 2014 at 4:43 PM, Patrick Wendell <[email protected]> wrote:
>>
>>> Hey Nicholas,
>>>
>>> Yeah so Jenkins has its own timeout mechanism and it will just kill the
>>> entire build after 120 minutes. But since run-tests is sitting in the
>>> middle of the tests, it can't actually post a failure message.
>>>
>>> I think run-tests-jenkins should just wrap the call to run-tests in
>>> its own timeout. It might be possible to just use this:
>>>
>>> http://linux.die.net/man/1/timeout
>>>
>>> - Patrick
>>>
>>>
>>> On Fri, Aug 15, 2014 at 1:31 PM, Nicholas Chammas <[email protected]> wrote:
>>>
>>>> OK, I've captured this in SPARK-3076
>>>> <https://issues.apache.org/jira/browse/SPARK-3076>.
>>>>
>>>> Patrick,
>>>>
>>>> Is the problem that this run-tests
>>>> <https://github.com/apache/spark/blob/0afe5cb65a195d2f14e8dfcefdbec5dac023651f/dev/run-tests-jenkins#L151>
>>>> step times out, and that is currently not handled gracefully? To be more
>>>> specific, it hangs for 120 minutes, times out, but the parent script for
>>>> some reason is also terminated. Does that sound right?
>>>>
>>>> Nick
>>>>
>>>>
>>>> On Fri, Aug 15, 2014 at 3:33 PM, Shivaram Venkataraman <[email protected]> wrote:
>>>>
>>>>> Jenkins runs for this PR https://github.com/apache/spark/pull/1960
>>>>> timed out without notification. The relevant Jenkins logs are at
>>>>>
>>>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull
>>>>>
>>>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull
>>>>>
>>>>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull
>>>>>
>>>>>
>>>>> On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas <[email protected]> wrote:
>>>>>
>>>>>> Shivaram,
>>>>>>
>>>>>> Can you point us to an example of that happening? The Jenkins console
>>>>>> output, that is.
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman <[email protected]> wrote:
>>>>>>
>>>>>>> Also I think Jenkins doesn't post build timeouts to GitHub. Is there
>>>>>>> any way we can fix that?
>>>>>>>
>>>>>>> On Aug 15, 2014 9:04 AM, "Patrick Wendell" <[email protected]> wrote:
>>>>>>>
>>>>>>> > Hi All,
>>>>>>> >
>>>>>>> > I noticed that all PR tests run overnight had failed due to timeouts.
>>>>>>> > The patch that updates the netty shuffle I believe somehow inflated
>>>>>>> > the build time significantly. That patch had been tested, but one
>>>>>>> > change was made before it was merged that was not tested.
>>>>>>> >
>>>>>>> > I've reverted the patch for now to see if it brings the build times
>>>>>>> > back down.
>>>>>>> >
>>>>>>> > - Patrick
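
For readers following the thread, here is a minimal sketch of the idea discussed above: run the test step under its own, shorter timeout so the wrapper is still alive to report a failure before Jenkins kills the whole build. It is written in Python purely for illustration and is not the change that went into run-tests-jenkins. The names `run_with_timeout`, `describe`, `TESTS_TIMEOUT_MINUTES`, and `JENKINS_CAP_MINUTES` are hypothetical; the 120 and 100 minute figures come from the messages above, and the 124 status mirrors what timeout(1) returns when a command overruns.

```python
import subprocess
import sys

# Figures taken from the thread; the variable names are made up for this sketch.
JENKINS_CAP_MINUTES = 120    # Jenkins kills the entire build after this long
TESTS_TIMEOUT_MINUTES = 100  # inner limit, kept below the Jenkins cap

# timeout(1) exits with 124 when the command times out; reuse that convention.
TIMED_OUT = 124


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and return its exit status, or TIMED_OUT if it overran.

    subprocess.run kills the direct child when the timeout expires.
    Grandchild processes are not handled here, so a real wrapper may
    need extra cleanup.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_seconds).returncode
    except subprocess.TimeoutExpired:
        return TIMED_OUT


def describe(status, timeout_minutes):
    """Turn an exit status into the message the wrapper would post."""
    if status == TIMED_OUT:
        return f"Tests timed out after {timeout_minutes} minutes."
    if status == 0:
        return "Tests passed."
    return f"Tests failed with exit status {status}."


if __name__ == "__main__":
    # "./dev/run-tests" is assumed to be the command being wrapped.
    status = run_with_timeout(["./dev/run-tests"], TESTS_TIMEOUT_MINUTES * 60)
    print(describe(status, TESTS_TIMEOUT_MINUTES))
    sys.exit(status)
```

Keeping the inner limit in a single variable, as suggested in the latest reply, makes it easy to check that it stays under whatever Jenkins is configured to allow.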
