Hey Nicholas,

Yeah, so Jenkins has its own timeout mechanism and it will just kill the
entire build after 120 minutes. But since run-tests is sitting in the
middle of the tests, it can't actually post a failure message.

I think run-tests-jenkins should just wrap the call to run-tests in its own
timeout. It might be possible to just use this:

http://linux.die.net/man/1/timeout
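
Roughly what I have in mind, as an untested sketch rather than a patch. The
TESTS_TIMEOUT variable, the run_with_timeout helper, and the messages are
made up for illustration. The one real detail is that GNU timeout exits with
status 124 when it has to kill the command, which lets the wrapper tell a
timeout apart from an ordinary test failure:

```shell
#!/usr/bin/env bash
# Sketch only: TESTS_TIMEOUT, run_with_timeout and the messages below are
# hypothetical names, not taken from the existing run-tests-jenkins script.

# Assumed value: somewhat below Jenkins' own 120-minute limit, so this
# script is still alive to report the result.
TESTS_TIMEOUT="${TESTS_TIMEOUT:-110m}"

run_with_timeout() {
  local status=0
  # GNU coreutils `timeout` exits with 124 if the command timed out;
  # otherwise it passes through the command's own exit status.
  timeout "$TESTS_TIMEOUT" "$@" || status=$?

  if [ "$status" -eq 124 ]; then
    echo "Tests timed out after $TESTS_TIMEOUT."
  elif [ "$status" -ne 0 ]; then
    echo "Tests failed with exit code $status."
  else
    echo "Tests passed."
  fi
  return "$status"
}

# Intended use inside the Jenkins wrapper (path assumed):
#   run_with_timeout ./dev/run-tests
```

The echoed message is a stand-in for whatever the script would post back to
the pull request.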

- Patrick


On Fri, Aug 15, 2014 at 1:31 PM, Nicholas Chammas <
[email protected]> wrote:

> OK, I've captured this in SPARK-3076
> <https://issues.apache.org/jira/browse/SPARK-3076>.
>
> Patrick,
>
> Is the problem that this run-tests
> <https://github.com/apache/spark/blob/0afe5cb65a195d2f14e8dfcefdbec5dac023651f/dev/run-tests-jenkins#L151>
>  step
> times out, and that is currently not handled gracefully? To be more
> specific, it hangs for 120 minutes, times out, but the parent script for
> some reason is also terminated. Does that sound right?
>
> Nick
>
>
> On Fri, Aug 15, 2014 at 3:33 PM, Shivaram Venkataraman <
> [email protected]> wrote:
>
>> Jenkins runs for this PR https://github.com/apache/spark/pull/1960 timed
>> out without notification. The relevant Jenkins logs are at
>>
>>
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18588/consoleFull
>>
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18592/consoleFull
>>
>> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18597/consoleFull
>>
>>
>> On Fri, Aug 15, 2014 at 11:44 AM, Nicholas Chammas <
>> [email protected]> wrote:
>>
>>> Shivaram,
>>>
>>> Can you point us to an example of that happening? The Jenkins console
>>> output, that is.
>>>
>>> Nick
>>>
>>>
>>> On Fri, Aug 15, 2014 at 2:28 PM, Shivaram Venkataraman <
>>> [email protected]> wrote:
>>>
>>>> Also I think Jenkins doesn't post build timeouts to GitHub. Is there
>>>> any way we can fix that?
>>>> On Aug 15, 2014 9:04 AM, "Patrick Wendell" <[email protected]> wrote:
>>>>
>>>> > Hi All,
>>>> >
>>>> > I noticed that all PR tests run overnight had failed due to timeouts.
>>>> > I believe the patch that updates the netty shuffle somehow inflated
>>>> > the build time significantly. That patch had been tested, but one
>>>> > untested change was made before it was merged.
>>>> >
>>>> > I've reverted the patch for now to see if it brings the build times
>>>> > back down.
>>>> >
>>>> > - Patrick
>>>> >
>>>>
>>>
>>>
>>
>
