SLF4j logging system gets clobbered?

2017-10-18 Thread Jared Stehler
I’m having an issue where I’ve got logging set up and functioning for my flink-mesos deployment, and it works fine up to a point (the same point every time) where it seems to fall back to “defaults” and loses all of my configured filtering. 2017-10-11 21:37:17.454 [flink-akka.actor.default-dispatch

Re: Task Manager was lost/killed due to full GC

2017-10-18 Thread Fabian Hueske
Thanks for the heads-up and for explaining how you resolved the issue! Best, Fabian 2017-10-18 3:50 GMT+02:00 ShB: > I just wanted to leave an update about this issue, for someone else who might come across it. The problem was with memory, but it was disk memory and not heap/off-heap memory. Y

Re: Off heap memory issue

2017-10-18 Thread Javier Lopez
Hi Robert, Sorry to reply this late. We did a lot of tests, trying to identify if the problem was in our custom sources/sinks. We figured out that none of our custom components is causing this problem. We came up with a small test, and realized that the Flink nodes run out of non-heap JVM memory a

Re: Off heap memory issue

2017-10-18 Thread Flavio Pompermaier
We also faced the same problem, but the number of jobs we can run before restarting the cluster depends on the volume of data shuffled around the network. We even had problems with a single job, and in order to avoid OOM issues we had to add some configuration to limit Netty memory usage, i.e.

Garbage collection concerns with Task Manager memory

2017-10-18 Thread Marchant, Hayden
I read in the Flink documentation that the TaskManager runs all tasks within its own JVM, and that the recommendation is to set taskmanager.heap.mb to as much as is available on the server. I have a very large server with 192GB, so I am thinking of giving most of it to the Task Manager. I reca

Re: Maven release

2017-10-18 Thread Gary Yao
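Hi Biswajit, The distribution management configuration can be found in the parent pom of flink-parent: <parent> <groupId>org.apache</groupId> <artifactId>apache</artifactId> <version>18</version> </parent> <distributionManagement> <repository> <id>apache.releases.https</id> <name>Apache Release Distribution Repository</name> <url>https://repository.apache.org/service/local/staging/deploy/maven2</url> </repository> <snapshotRepository> <id>apache.sna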

Re: Garbage collection concerns with Task Manager memory

2017-10-18 Thread Kien Truong
Hi, Yes, GC is still a major concern. Even G1 has a hard time dealing with >64GB heap in our experience. To mitigate, we run multiple TMs with smaller heap per machine, and use RocksDBStateBackend. Best regards, Kien On 10/18/2017 4:40 PM, Marchant, Hayden wrote: I read in the Flink docu

Re: Stumped writing to KafkaJSONSink

2017-10-18 Thread Fabian Hueske
Hi Kenny, this looks almost correct. The Table class has a method writeToSink(TableSink) that should address your use case (so the same as yours but without the TableEnvironment argument). Does that work for you? If not, what kind of error and error message do you get? Best, Fabian 2017-10-18 1:2

Re: Off heap memory issue

2017-10-18 Thread Kien Truong
Hi, We saw a similar issue in one of our jobs due to a ByteBuffer memory leak [1]. We fixed it using the solution in the article, setting -Djdk.nio.maxCachedBufferSize. This option is available in Java 8u102 and later. Best regards, Kien [1] http://www.evanjones.ca/java-bytebuffer-leak.html On 10/18

Re: start-cluster.sh not working in HA mode

2017-10-18 Thread Fabian Hueske
Hi Hayden, I tried to reproduce the problem you described and followed the HA setup instructions of the documentation [1]. For me the instructions worked and start-cluster.sh started two JobManagers on my local machine (master contained two localhost entries). The bash scripts tend to be a bit fr

Re: Stumped writing to KafkaJSONSink

2017-10-18 Thread Kenny Gorman
Yep, we hung out and got it working. I should have replied sooner! Thx for the reply. -kg > On Oct 18, 2017, at 7:06 AM, Fabian Hueske wrote: > Hi Kenny, this looks almost correct. The Table class has a method writeToSink(TableSink) that should address your use case (so the same as

Re: Stumped writing to KafkaJSONSink

2017-10-18 Thread Fabian Hueske
No worries :-) Thanks for the notice. 2017-10-18 15:07 GMT+02:00 Kenny Gorman: > Yep, we hung out and got it working. I should have replied sooner! Thx for the reply. -kg >> On Oct 18, 2017, at 7:06 AM, Fabian Hueske wrote: >> Hi Kenny, this looks almost correct. The Table class has

RE: GROUP BY TUMBLE on ROW range

2017-10-18 Thread Stefano Bortoli
Great, thanks for the explanation. I now see that the examples are indeed for the Table API. I believe an OVER window is sufficient for the purpose right now; I was just curious. Best, Stefano From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: Tuesday, October 17, 2017 9:24 PM To: Stefano Bort

Problems with taskmanagers in Mesos Cluster

2017-10-18 Thread Manuel Montesino
Hi, We have deployed a Mesos cluster with Marathon, and we deploy Flink sessions through Marathon with multiple taskmanagers configured. Sometimes, in earlier stages, we change the configuration in the Marathon JSON (memory and other settings), but when we redeploy the Flink session the jobmanagers stop