It hangs for me too at the same test when running "clean verify".
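
For what it's worth, the quoted dump shows "main" blocked in ServerSocket.accept() inside testSocketSinkRetryAccess(), so the hang looks like a server socket waiting for a client that never connects. Here is a minimal sketch of one way to bound that wait, assuming a plain java.net.ServerSocket. AcceptWithTimeout and acceptOrTimeout are hypothetical names for illustration, not code from SocketClientSinkTest and not necessarily how the actual fix works.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical helper, for illustration only (not part of Flink).
public class AcceptWithTimeout {

    /**
     * Waits for one client connection, but at most timeoutMillis.
     * Returns true if a client connected, false if the wait timed out.
     */
    static boolean acceptOrTimeout(ServerSocket server, int timeoutMillis) throws IOException {
        // Without SO_TIMEOUT, accept() blocks until a client connects.
        server.setSoTimeout(timeoutMillis);
        try (Socket client = server.accept()) {
            return true;
        } catch (SocketTimeoutException e) {
            // No client showed up in time; report it instead of hanging.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            boolean connected = acceptOrTimeout(server, 500);
            System.out.println("client connected: " + connected);
        }
    }
}
```

With a timeout set, a missing client would surface as a failed test rather than a build that never finishes.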
> On 23 Sep 2015, at 16:09, Stephan Ewen <se...@apache.org> wrote:
>
> Okay, will look into this a bit today...
>
> On Wed, Sep 23, 2015 at 4:04 PM, Ufuk Celebi <u...@apache.org> wrote:
>
>> Same here.
>>
>>> On 23 Sep 2015, at 13:50, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> It's the latest master I'm trying to build, but it still hangs.
>>> Here's the trace:
>>>
>>> -----------------------------
>>> 2015-09-23 13:48:41
>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
>>>
>>> "Attach Listener" daemon prio=5 tid=0x00007faeb984a000 nid=0x3707 waiting
>>> on condition [0x0000000000000000]
>>> java.lang.Thread.State: RUNNABLE
>>>
>>> "Service Thread" daemon prio=5 tid=0x00007faeb9808000 nid=0x4d03 runnable
>>> [0x0000000000000000]
>>> java.lang.Thread.State: RUNNABLE
>>>
>>> "C2 CompilerThread1" daemon prio=5 tid=0x00007faebb00e800 nid=0x4b03
>>> waiting on condition [0x0000000000000000]
>>> java.lang.Thread.State: RUNNABLE
>>>
>>> "C2 CompilerThread0" daemon prio=5 tid=0x00007faebb840800 nid=0x4903
>>> waiting on condition [0x0000000000000000]
>>> java.lang.Thread.State: RUNNABLE
>>>
>>> "Signal Dispatcher" daemon prio=5 tid=0x00007faeba806800 nid=0x3d0f
>>> runnable [0x0000000000000000]
>>> java.lang.Thread.State: RUNNABLE
>>>
>>> "Finalizer" daemon prio=5 tid=0x00007faebb836800 nid=0x3303 in
>>> Object.wait() [0x000000014eff8000]
>>> java.lang.Thread.State: WAITING (on object monitor)
>>> at java.lang.Object.wait(Native Method)
>>> - waiting on <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>>> - locked <0x0000000138a84858> (a java.lang.ref.ReferenceQueue$Lock)
>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>>>
>>> "Reference Handler" daemon prio=5 tid=0x00007faebb004000 nid=0x3103 in
>>> Object.wait() [0x000000014eef5000]
>>> java.lang.Thread.State: WAITING (on object monitor)
>>> at java.lang.Object.wait(Native Method)
>>> - waiting on <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
>>> at java.lang.Object.wait(Object.java:503)
>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>>> - locked <0x0000000138a84470> (a java.lang.ref.Reference$Lock)
>>>
>>> "main" prio=5 tid=0x00007faeb9009800 nid=0xd03 runnable
>>> [0x000000010f1c0000]
>>> java.lang.Thread.State: RUNNABLE
>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
>>> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>>> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>>> at java.net.ServerSocket.accept(ServerSocket.java:498)
>>> at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>>
>>> "VM Thread" prio=5 tid=0x00007faebb82e800 nid=0x2f03 runnable
>>>
>>> "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007faeb9806800 nid=0x1e03
>>> runnable
>>>
>>> "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007faebb000000 nid=0x2103
>>> runnable
>>>
>>> "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007faebb001000 nid=0x2303
>>> runnable
>>>
>>> "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007faebb001800 nid=0x2503
>>> runnable
>>>
>>> "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007faebb002000 nid=0x2703
>>> runnable
>>>
>>> "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007faebb002800 nid=0x2903
>>> runnable
>>>
>>> "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007faebb003800 nid=0x2b03
>>> runnable
>>>
>>> "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007faeb9809000 nid=0x2d03
>>> runnable
>>>
>>> "VM Periodic Task Thread" prio=5 tid=0x00007faeb980e000 nid=0x4f03 waiting
>>> on condition
>>>
>>> JNI global references: 195
>>>
>>>
>>>
>>>
>>> On 23 September 2015 at 13:35, Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> I have pushed it, yes. If you rebase onto the latest master, it should
>>>> work.
>>>>
>>>> If you can verify that it still hangs, can you post a stack trace dump?
>>>>
>>>> Thanks,
>>>> Stephan
>>>>
>>>>
>>>> On Wed, Sep 23, 2015 at 12:37 PM, Vasiliki Kalavri <
>>>> vasilikikala...@gmail.com> wrote:
>>>>
>>>>> @Stephan, have you pushed that fix for SocketClientSinkTest? Local builds
>>>>> still hang for me :S
>>>>>
>>>>> On 21 September 2015 at 22:55, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
>>>>>
>>>>>> Yes, you're right. BarrierBufferMassiveRandomTest has actually finished :-)
>>>>>> Sorry for the confusion! I'll wait for your fix then, thanks!
>>>>>>
>>>>>> On 21 September 2015 at 22:51, Stephan Ewen <se...@apache.org> wrote:
>>>>>>
>>>>>>> I am actually very happy that it is not the
>>>>>>> "BarrierBufferMassiveRandomTest", that would be hell to debug...
>>>>>>>
>>>>>>> On Mon, Sep 21, 2015 at 10:51 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>
>>>>>>>> Ah, actually it is a different test. I think you got confused by the
>>>>>>>> sysout log, because multiple parallel tests print there (that makes it
>>>>>>>> not always obvious which one hangs).
>>>>>>>>
>>>>>>>> The test is the "SocketClientSinkTest.testSocketSinkRetryAccess()" test.
>>>>>>>> You can see that by looking at which test case the "main" thread is
>>>>>>>> stuck in.
>>>>>>>>
>>>>>>>> This test is very unstable but, fortunately, I made a fix 1h ago and it
>>>>>>>> is being tested on Travis right now :-)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Stephan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Sep 21, 2015 at 10:23 PM, Vasiliki Kalavri <
>>>>>>>> vasilikikala...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Locally yes.
>>>>>>>>>
>>>>>>>>> Here's the stack trace:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2015-09-21 22:22:46
>>>>>>>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode):
>>>>>>>>>
>>>>>>>>> "Attach Listener" daemon prio=5 tid=0x00007ff9d104e800 nid=0x4013 waiting
>>>>>>>>> on condition [0x0000000000000000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>
>>>>>>>>> "Service Thread" daemon prio=5 tid=0x00007ff9d3807000 nid=0x4c03 runnable
>>>>>>>>> [0x0000000000000000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>
>>>>>>>>> "C2 CompilerThread1" daemon prio=5 tid=0x00007ff9d2001000 nid=0x4a03
>>>>>>>>> waiting on condition [0x0000000000000000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>
>>>>>>>>> "C2 CompilerThread0" daemon prio=5 tid=0x00007ff9d201e000 nid=0x4803
>>>>>>>>> waiting on condition [0x0000000000000000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>
>>>>>>>>> "Signal Dispatcher" daemon prio=5 tid=0x00007ff9d3012800 nid=0x451b
>>>>>>>>> runnable [0x0000000000000000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>>
>>>>>>>>> "Finalizer" daemon prio=5 tid=0x00007ff9d4005800 nid=0x3303 in
>>>>>>>>> Object.wait() [0x000000011430d000]
>>>>>>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>>>> - waiting on <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>>>>>>>>> - locked <0x00000007ef504858> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>>>>>>>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
>>>>>>>>>
>>>>>>>>> "Reference Handler" daemon prio=5 tid=0x00007ff9d480b000 nid=0x3103 in
>>>>>>>>> Object.wait() [0x000000011420a000]
>>>>>>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>>>> - waiting on <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
>>>>>>>>> at java.lang.Object.wait(Object.java:503)
>>>>>>>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>>>>>>>>> - locked <0x00000007ef504470> (a java.lang.ref.Reference$Lock)
>>>>>>>>>
>>>>>>>>> "main" prio=5 tid=0x00007ff9d4800000 nid=0xd03 runnable
>>>>>>>>> [0x000000010b764000]
>>>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>>> at java.net.PlainSocketImpl.socketAccept(Native Method)
>>>>>>>>> at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>>>>>>>>> at java.net.ServerSocket.implAccept(ServerSocket.java:530)
>>>>>>>>> at java.net.ServerSocket.accept(ServerSocket.java:498)
>>>>>>>>> at org.apache.flink.streaming.api.functions.sink.SocketClientSinkTest.testSocketSinkRetryAccess(SocketClientSinkTest.java:315)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>>>>> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>>>>>>>>> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>>>>>>>>> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>>>>>>>>> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>>>>>>>>> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>>>>>>>>> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>>>>>>>>> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>>>>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>>>>>>>>> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>>>>>>> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>>>>>>>>> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>>>>>>>>> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>>>>>>>>> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>>>>>>>>> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>>>>>>>>> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
>>>>>>>>> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
>>>>>>>>> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
>>>>>>>>>
>>>>>>>>> "VM Thread" prio=5 tid=0x00007ff9d4005000 nid=0x2f03 runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#0 (ParallelGC)" prio=5 tid=0x00007ff9d2005800 nid=0x1f03
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#1 (ParallelGC)" prio=5 tid=0x00007ff9d1800000 nid=0x2103
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#2 (ParallelGC)" prio=5 tid=0x00007ff9d1804800 nid=0x2303
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#3 (ParallelGC)" prio=5 tid=0x00007ff9d1805000 nid=0x2503
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#4 (ParallelGC)" prio=5 tid=0x00007ff9d1805800 nid=0x2703
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#5 (ParallelGC)" prio=5 tid=0x00007ff9d1806800 nid=0x2903
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#6 (ParallelGC)" prio=5 tid=0x00007ff9d1807000 nid=0x2b03
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "GC task thread#7 (ParallelGC)" prio=5 tid=0x00007ff9d1807800 nid=0x2d03
>>>>>>>>> runnable
>>>>>>>>>
>>>>>>>>> "VM Periodic Task Thread" prio=5 tid=0x00007ff9d1006000 nid=0x4e03 waiting
>>>>>>>>> on condition
>>>>>>>>>
>>>>>>>>> JNI global references: 193
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 21 September 2015 at 22:13, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> This happened locally on your machine?
>>>>>>>>>>
>>>>>>>>>> Can you dump the stack-trace and post it? "jstack <processid> >
>>>>>>>>>> stacktrace.txt" or so...
>>>>>>>>>>
>>>>>>>>>> On Mon, Sep 21, 2015 at 10:09 PM, Vasiliki Kalavri <
>>>>>>>>>> vasilikikala...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi squirrels,
>>>>>>>>>>>
>>>>>>>>>>> I've been meaning to merge a PR (#1520), but my local maven build gets
>>>>>>>>>>> stuck at
>>>>>>>>>>> org.apache.flink.streaming.runtime.io.BarrierBufferMassiveRandomTest.
>>>>>>>>>>> It looks like a deadlock. The build just hangs there and top shows no
>>>>>>>>>>> CPU/memory load. Has anyone else experienced the same? I'm on OS X
>>>>>>>>>>> 10.10.
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> -Vasia.