Hello,
Thank you for all the advice. Sorry for my late response.
First, I made a small mistake.
I set `env.java.opts.taskmanager` to enable GC logging, and by
accident this cancelled the automatic setting of the `UseG1GC` option.
This means the logs I had been looking at were from the Parallel GC.
When I enabled both GC logging and `UseG1GC`
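For reference, the setting in flink-conf.yaml now looks roughly like
this (the log path is a placeholder, and the GC logging flags are the
JDK 8 ones):

    env.java.opts.taskmanager: -XX:+UseG1GC -Xloggc:/path/to/taskmanager-gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

Specifying `-XX:+UseG1GC` explicitly keeps G1 enabled even though
setting `env.java.opts.taskmanager` overrides the default GC choice.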
We also saw issues in the failure detection/quarantining with some
Hadoop versions because of a subtle runtime Netty version conflict.
Flink 1.4 shades Flink's / Akka's Netty; in Flink 1.3 you may need to
explicitly exclude the Netty dependency pulled in through Hadoop.
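In a Maven project the exclusion would look roughly like the sketch
below; the Hadoop artifact and version are placeholders, and the exact
coordinates of the Netty it pulls in can differ between Hadoop versions:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.5</version>
        <exclusions>
            <exclusion>
                <!-- Netty 3.x shipped via Hadoop; can clash with Akka's Netty at runtime -->
                <groupId>io.netty</groupId>
                <artifactId>netty</artifactId>
            </exclusion>
        </exclusions>
    </dependency>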
Also, Hadoop version mismatch
Hi,
you could also try increasing the heartbeat timeout via
`akka.watch.heartbeat.pause`. Maybe this helps to overcome the GC pauses.
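For example, in flink-conf.yaml (the values are only an illustration;
pick a pause comfortably above the GC pauses you observe):

    akka.watch.heartbeat.interval: 10 s
    akka.watch.heartbeat.pause: 120 s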
Cheers,
Till
On Wed, Nov 29, 2017 at 12:41 PM, T Obi wrote:
> The Datanode warnings did not appear in every timeout case. They seem
> to be raised only by timeouts during snapshotting.
The Datanode warnings did not appear in every timeout case. They seem
to be raised only by timeouts during snapshotting.
We enabled GC logging on the taskmanagers and found that something
calls System.gc() every hour.
So a full GC runs every hour, and in our cases it takes about a minute
or more...
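If these turn out to be explicit System.gc() calls (hourly calls often
come from RMI's distributed GC, whose default interval is one hour),
the usual JVM-side workarounds would be along these lines; a sketch we
have not yet verified in our setup (GC logging flags omitted):

    env.java.opts.taskmanager: -XX:+UseG1GC -XX:+ExplicitGCInvokesConcurrent
    # or, to ignore explicit GC calls entirely (note that direct-memory
    # cleanup in some libraries relies on System.gc()):
    env.java.opts.taskmanager: -XX:+UseG1GC -XX:+DisableExplicitGC

With G1 enabled, -XX:+ExplicitGCInvokesConcurrent turns an explicit
System.gc() into a concurrent cycle instead of a stop-the-world full GC.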
When
Hello Chesnay,
Thank you for answering my rough question.
Not all of the taskmanagers are quarantined at a time, but each
taskmanager has been quarantined at least once.
We are using CDH 5.8, which is based on Hadoop 2.6.
We hadn't paid attention to the datanodes. We will check them.
However, we are also using t
Are only some taskmanagers quarantined, or all of them?
Do the quarantined taskmanagers have anything in common?
(Are the failing ones always on certain machines? Do the stack traces
reference the same HDFS datanodes?)
Which Hadoop version are you using?
From the stack trace it appears that mul