I looked into the problem, and it is a deserialization issue on the TaskManager side: the system cannot send InputSplits around whose classes are contained only in the user-code jars. Fabian already observed a similar issue in FLINK-1438. I used his test program and got the same error as with the Thiazi-parser. Furthermore, I could solve the problem by putting the flink-hadoop-compatibility jar into Flink's lib folder, which matches the behavior Fabian described.
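For illustration, here is a minimal sketch of the root cause (this is not Flink's actual code; the class and field names are made up). A plain ObjectInputStream resolves classes against the default classloader, which does not know classes that live only in the user-code jar; overriding resolveClass to consult the user-code classloader first, as the TaskManager would have to, avoids the ClassNotFoundException:

import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Sketch: deserialization that resolves classes against the user-code
// classloader before falling back to the default resolution.
class UserCodeObjectInputStream extends ObjectInputStream {
    private final ClassLoader userCodeClassLoader;

    UserCodeObjectInputStream(InputStream in, ClassLoader userCodeClassLoader)
            throws IOException {
        super(in);
        this.userCodeClassLoader = userCodeClassLoader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the user-code jar first ...
            return Class.forName(desc.getName(), false, userCodeClassLoader);
        } catch (ClassNotFoundException e) {
            // ... and fall back to the system classloader otherwise. With a
            // plain ObjectInputStream only this fallback exists, which is why
            // InputSplit classes from the user jar cannot be found.
            return super.resolveClass(desc);
        }
    }
}

Putting flink-hadoop-compatibility into the lib folder works around this because the classes then become visible to the default classloader.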
On Thu, Jan 29, 2015 at 9:14 AM, Till Rohrmann <trohrm...@apache.org> wrote:

> Yes, actually the timeouts should not really matter. However, an exception
> in the InputSplitAssigner should happen in the actor thread and thus cause
> the actor to stop. This should be logged by the supervisor.
>
> I just checked, and the method InputSplitAssigner.getNextInputSplit is not
> supposed to throw any checked exceptions.
>
> On Thu, Jan 29, 2015 at 6:38 AM, Stephan Ewen <se...@apache.org> wrote:
>
>> @Till: The default timeouts are high enough that such a timeout should
>> actually not occur, right? Increasing the timeouts cannot really be the
>> issue.
>>
>> Might it be something different? What happens if there is an error in the
>> code that produces the input split? Is that properly handled, or is the
>> receiver simply never getting an answer? Could that be what happened?
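To make Stephan's concern concrete, here is a hedged sketch of the failure mode he describes (all names are hypothetical stand-ins, not Flink's JobManager code): if getNextInputSplit throws an unchecked exception inside the actor's message handler and nothing catches it, the requester never receives a reply and blocks until its ask-timeout fires; catching the error and replying with it instead surfaces the failure immediately:

// Hypothetical sketch, not Flink's actual actor code.
public final class SplitRequestHandler {

    // Minimal stand-in for an InputSplitAssigner; may throw unchecked exceptions.
    interface SplitAssigner {
        Object getNextInputSplit(String host);
    }

    // Minimal stand-in for an actor reply channel.
    interface Reply {
        void send(Object message);
    }

    static void handleSplitRequest(SplitAssigner assigner, String host, Reply sender) {
        try {
            // A null result is a legal answer meaning "no more splits".
            sender.send(assigner.getNextInputSplit(host));
        } catch (Throwable t) {
            // Without this catch-and-reply, the failure is visible only to the
            // actor's supervisor and the requesting TaskManager simply waits
            // until its timeout expires, never getting an answer.
            sender.send(new RuntimeException("Input split assignment failed", t));
        }
    }
}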