Thanks Yang, that helped.

Sent from my iPhone
> On Aug 24, 2020, at 8:44 PM, Yang Wang <danrtsey...@gmail.com> wrote:
>
> I think you have at least two different exceptions.
>
> java.lang.Exception: Container released on a *lost* node
> This usually means a Yarn nodemanager is down, so all the containers running
> on that node are released and rescheduled to new ones. To find the root
> cause, you need to check the Yarn nodemanager logs.
>
> java.lang.OutOfMemoryError: Metaspace
> Could you check the value of the Flink configuration option
> "taskmanager.memory.jvm-metaspace.size"? If it is too small, increasing it
> will help. Usually, 256m is enough for most cases.
>
> Best,
> Yang
>
> Vijayendra Yadav <contact....@gmail.com> wrote on Tue, Aug 25, 2020 at 4:51 AM:
>> Another one -
>>
>> Exception in thread "FileCache shutdown hook"
>> Exception: java.lang.OutOfMemoryError thrown from the
>> UncaughtExceptionHandler in thread "FileCache shutdown hook"
>>
>> Regards,
>> Vijay
>>
>>> On Mon, Aug 24, 2020 at 1:04 PM Vijayendra Yadav <contact....@gmail.com> wrote:
>>> Actually, I found this message in the rolled-over container logs:
>>>
>>> [org.slf4j.impl.Log4jLoggerFactory]
>>> Exception in thread "cb-timer-1-1" java.lang.OutOfMemoryError: Metaspace
>>> Exception in thread "Thread-16" java.lang.OutOfMemoryError: Metaspace
>>> Exception in thread "TransientBlobCache shutdown hook" java.lang.OutOfMemoryError: Metaspace
>>> Exception in thread "FileChannelManagerImpl-io shutdown hook" java.lang.OutOfMemoryError: Metaspace
>>> Exception in thread "Kafka Fetcher for Source: flink-kafka-consumer -> Map -> Filter -> Map -> Sink: s3-sink-raw (2/3)" java.lang.OutOfMemoryError: Metaspace
>>> Exception in thread "FileCache shutdown hook" java.lang.OutOfMemoryError: Metaspace
>>>
>>> Any suggestions on how to fix it?
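For reference, Yang's suggestion corresponds to an entry like the following in flink-conf.yaml. The 512m value here is an illustrative assumption, not taken from the thread; Yang notes 256m is usually enough, so tune it to your job's classloading footprint:

```yaml
# flink-conf.yaml -- raise the per-TaskManager JVM metaspace limit.
# 512m is an illustrative value; start from the default and increase
# until the "java.lang.OutOfMemoryError: Metaspace" errors stop.
taskmanager.memory.jvm-metaspace.size: 512m
```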
>>>
>>>> On Mon, Aug 24, 2020 at 12:53 PM Vijayendra Yadav <contact....@gmail.com> wrote:
>>>> Hi Team,
>>>>
>>>> Running a Flink job on Yarn, I am trying to connect to a Couchbase DB in
>>>> one of the map functions of my Flink streaming job. But my task manager
>>>> containers keep failing, and Yarn keeps assigning new containers, which
>>>> gives me no opportunity to collect any useful logs.
>>>>
>>>> val cluster = Cluster.connect("host", "user", "pwd")
>>>> val bucket = cluster.bucket("bucket")
>>>> val collection = bucket.defaultCollection
>>>>
>>>> The only thing I see is this Yarn exception:
>>>>
>>>> java.lang.Exception: Container released on a *lost* node
>>>>     at org.apache.flink.yarn.YarnResourceManager.lambda$onContainersCompleted$0(YarnResourceManager.java:343)
>>>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
>>>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
>>>>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>>>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>>>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>>>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>>>     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>>>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>>>     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>>>     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>>>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>>>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>
>>>> Could you please provide any insight on how to get the logs, and why a
>>>> simple connection does not work?
>>>>
>>>> Note: it works on the Yarn setup on my local system.
>>>>
>>>> Regards,
>>>> Vijay
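One common cause of the Metaspace errors seen later in this thread is creating the Couchbase connection on the per-record path (or on every task restart) instead of once per task. A sketch of the usual Flink pattern is below: the connection lives in a `RichMapFunction` as `@transient lazy val` fields so it is created once on the task manager and never serialized with the job graph. The three connection lines are copied from the original mail; `EnrichWithCouchbase` and the `String => String` signature are illustrative assumptions, and you may need to adapt the calls to your Couchbase SDK version (some versions of the Scala SDK wrap the result of `Cluster.connect` in a `Try`):

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import com.couchbase.client.scala.Cluster

// Hypothetical enrichment function; the in/out types are placeholders.
class EnrichWithCouchbase extends RichMapFunction[String, String] {

  // @transient + lazy: created on the task manager the first time a
  // record is processed, once per task, and never shipped in the closure.
  // Connection lines as in the original mail; host/user/pwd are placeholders.
  @transient private lazy val cluster    = Cluster.connect("host", "user", "pwd")
  @transient private lazy val bucket     = cluster.bucket("bucket")
  @transient private lazy val collection = bucket.defaultCollection

  override def map(value: String): String = {
    // ... look up `value` in `collection` and enrich it (application-specific) ...
    value
  }
}
```

On retrieving logs from the released containers: with Yarn log aggregation enabled, `yarn logs -applicationId <application_id>` collects the stdout/stderr and log files of finished containers after the fact, which avoids the race against container reassignment described above.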