Hi Fabian,

Not sure if this answers your question; here is the stack trace I got when debugging the Combine and DataSource operators while the job was stuck:

"DataSource (at main(BatchTest.java:28) (org.apache.flink.api.java.io.TupleCsvInputFormat)) (1/8)"
    at java.lang.Object.wait(Object.java)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:163)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:93)
    at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
    at org.apache.flink.runtime.operators.util.metrics.CountingCollector.collect(CountingCollector.java:35)
    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:163)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)

"Combine (GroupReduce at first(DataSet.java:573)) (1/8)"
    at java.lang.Object.wait(Object.java)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:163)
    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:93)
    at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
    at org.apache.flink.api.java.functions.FirstReducer.reduce(FirstReducer.java:41)
    at org.apache.flink.api.java.functions.FirstReducer.combine(FirstReducer.java:52)
    at org.apache.flink.runtime.operators.AllGroupReduceDriver.run(AllGroupReduceDriver.java:152)
    at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:486)
    at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:351)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)

Best,
Yassine

2016-09-23 11:28 GMT+02:00 Yassine MARZOUGUI <y.marzou...@mindlytix.com>:

> Hi Fabian,
>
> Is it different from the output I already sent? (see attached file). If
> yes, how can I obtain the stack trace of the job programmatically? Thanks.
>
> Best,
> Yassine
>
> 2016-09-23 10:55 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>
>> Hi Yassine, can you share a stacktrace of the job when it got stuck?
>>
>> Thanks, Fabian
>>
>> 2016-09-22 14:03 GMT+02:00 Yassine MARZOUGUI <y.marzou...@mindlytix.com>:
>>
>>> The input splits are correctly assigned. I noticed that whenever the job
>>> is stuck, it is because the task *Combine (GroupReduce at
>>> first(DataSet.java:573))* keeps RUNNING and never switches to FINISHED.
>>> I tried to debug the program at the *first(100)*, but I couldn't do
>>> much. I attached the full DEBUG output.
>>>
>>> 2016-09-22 12:10 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
>>>
>>>> Can you try running with DEBUG logging level?
>>>> Then you should see if input splits are assigned.
>>>> Also, you could try to use a debugger to see what's going on.
>>>>
>>>> On Mon, Sep 19, 2016 at 2:04 PM, Yassine MARZOUGUI <
>>>> y.marzou...@mindlytix.com> wrote:
>>>>
>>>>> Hi Chesnay,
>>>>>
>>>>> I am running Flink 1.1.2 and using NetBeans 8.1.
>>>>> I made a screencast reproducing the problem here:
>>>>> http://recordit.co/P53OnFokN4 <http://recordit.co/VRBpBlb51A>.
>>>>>
>>>>> Best,
>>>>> Yassine
>>>>>
>>>>> 2016-09-19 10:04 GMT+02:00 Chesnay Schepler <ches...@apache.org>:
>>>>>
>>>>>> No, I can't recall that I had this happen to me.
>>>>>>
>>>>>> I would enable logging and try again, as well as check whether the
>>>>>> second job is actually running, through the WebInterface.
>>>>>>
>>>>>> If you tell me your NetBeans version I can try to reproduce it.
>>>>>>
>>>>>> Also, which version of Flink are you using?
>>>>>>
>>>>>> On 19.09.2016 07:45, Aljoscha Krettek wrote:
>>>>>>
>>>>>> Hmm, this sounds like it could be IDE/Windows-specific; unfortunately
>>>>>> I don't have access to a Windows machine. I'll loop in Chesnay, who is
>>>>>> using Windows.
>>>>>>
>>>>>> Chesnay, do you maybe have an idea what could be the problem? Have
>>>>>> you ever encountered this?
>>>>>>
>>>>>> On Sat, 17 Sep 2016 at 15:30 Yassine MARZOUGUI <
>>>>>> y.marzou...@mindlytix.com> wrote:
>>>>>>
>>>>>>> Hi Aljoscha,
>>>>>>>
>>>>>>> Thanks for your response. By "the first time" I mean the first time I
>>>>>>> hit run from the IDE (I am using NetBeans on Windows) after building
>>>>>>> the program. If I then stop it and run it again (without rebuilding),
>>>>>>> it is stuck in the state RUNNING. Sometimes I have to rebuild it, or
>>>>>>> close the IDE, to be able to get an output. The behaviour is random;
>>>>>>> maybe it's related to the IDE or the OS and not necessarily to Flink
>>>>>>> itself.
>>>>>>>
>>>>>>> On Sep 17, 2016 15:16, "Aljoscha Krettek" <aljos...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> When is the "first time"? It seems you have tried this repeatedly,
>>>>>>>> so what differentiates a "first time" from the other times? Are you
>>>>>>>> closing your IDE in between, or do you mean running the job a second
>>>>>>>> time within the same program?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Aljoscha
>>>>>>>>
>>>>>>>> On Fri, 9 Sep 2016 at 16:40 Yassine MARZOUGUI <
>>>>>>>> y.marzou...@mindlytix.com> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> When I run the following batch job inside the IDE for the first
>>>>>>>>> time, it outputs results and switches to FINISHED, but when I run
>>>>>>>>> it again it is stuck in the state RUNNING. The CSV file size is
>>>>>>>>> 160 MB. What could be the reason for this behaviour?
>>>>>>>>>
>>>>>>>>> import org.apache.flink.api.java.ExecutionEnvironment;
>>>>>>>>>
>>>>>>>>> public class BatchJob {
>>>>>>>>>
>>>>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>>>>         // getExecutionEnvironment() gives a local environment when run from the IDE
>>>>>>>>>         final ExecutionEnvironment env =
>>>>>>>>>                 ExecutionEnvironment.getExecutionEnvironment();
>>>>>>>>>
>>>>>>>>>         env.readCsvFile("dump.csv")
>>>>>>>>>                 .ignoreFirstLine()
>>>>>>>>>                 .fieldDelimiter(";")
>>>>>>>>>                 .includeFields("111000")
>>>>>>>>>                 .types(String.class, String.class, String.class)
>>>>>>>>>                 .first(100)
>>>>>>>>>                 .print();
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Yassine
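
On the question raised earlier in the thread about obtaining the job's stack trace programmatically: when the job is started from the IDE it runs in a local execution environment, so its task threads live in the same JVM as main(), and dumping all JVM threads captures the operator stacks. Below is a minimal sketch using only standard JDK calls (Thread.getAllStackTraces()); the class name, the watchdog delay, and the place where the dump is triggered are illustrative assumptions, not part of any Flink API.

import java.util.Map;

public class StackDumper {

    /** Prints the stack trace of every thread in this JVM, jstack-style. */
    public static void dumpAllStacks() {
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            System.err.println("\"" + entry.getKey().getName() + "\"");
            for (StackTraceElement frame : entry.getValue()) {
                System.err.println("    at " + frame);
            }
            System.err.println();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative usage: a daemon watchdog thread dumps all stacks after a
        // delay, while the (possibly stuck) batch job runs in the main thread.
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(60_000);
                dumpAllStacks();
            } catch (InterruptedException ignored) {
            }
        });
        watchdog.setDaemon(true);
        watchdog.start();

        // ... the batch job from this thread (readCsvFile(...).first(100).print())
        // would run here ...
    }
}

Separately, both stacks in the latest message are blocked in LocalBufferPool.requestBuffer, i.e. waiting for a free network buffer. It is not confirmed that buffer starvation is the root cause here (the stuck second run may just as well be the IDE/OS issue discussed above), but one cheap thing to try is handing a custom Configuration with more network buffers to the embedded local cluster. A sketch under that assumption; the value 4096 is an arbitrary example, not a recommendation:

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class BatchJobWithMoreBuffers {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Flink 1.x key for the number of network buffers available to the task manager.
        conf.setInteger("taskmanager.network.numberOfBuffers", 4096);

        // Pass the custom configuration to the local (in-IDE) environment.
        final ExecutionEnvironment env =
                ExecutionEnvironment.createLocalEnvironment(conf);

        env.readCsvFile("dump.csv")
                .ignoreFirstLine()
                .fieldDelimiter(";")
                .includeFields("111000")
                .types(String.class, String.class, String.class)
                .first(100)
                .print();
    }
}

If the second run still hangs with a fresh local environment and more buffers, the IDE/rebuild angle discussed above remains the more likely culprit.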