I have observed that a Flink-on-Tez test job stalls in two cases on the Travis CI server.
https://travis-ci.org/StephanEwen/incubator-flink/jobs/62302207

It looks like a shuffle fetch simply freezes instead of continuing. At first glance, the stack traces suggest that this is a Tez issue rather than a Flink issue (all threads are stuck in Tez methods), but one cannot be sure. Has anyone observed something similar before?