I have also seen this failure multiple times now. Here is another instance of it: https://travis-ci.org/apache/flink/jobs/62767646
I think the Tez community is currently voting on a new release. Maybe we should see if this one fixes the issue. Otherwise we should ask on their list.

On Wed, May 13, 2015 at 9:35 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> I think I saw it once, yes. But dismissed it as a fluke.
>
> On Wed, May 13, 2015 at 1:13 AM, Stephan Ewen <se...@apache.org> wrote:
> > I have observed that a Flink-on-Tez test job stalls in two cases on the
> > Travis CI server.
> >
> > https://travis-ci.org/StephanEwen/incubator-flink/jobs/62302207
> >
> > It looks like a shuffle fetch is simply not continuing, but freezing. The
> > stack traces suggest at a first glance that this is actually a Tez issue,
> > rather than a Flink issue (all threads stuck in Tez methods), but one
> > cannot be sure.
> >
> > Anyone observed something similar before?