Re: Memory Leak - Flink / RocksDB ?

2017-07-25 Thread Kien Truong
Hi, What is your task manager memory configuration? Can you post the TaskManager's log? Regards, Kien On 7/25/2017 8:41 PM, Shashwat Rastogi wrote: Hi, We have several Flink jobs, all of which read data from Kafka, do some aggregations (over sliding windows of (1d, 1h)) and write data

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-25 Thread Kien Truong
Hi Pedro, As long as there's no OutOfMemoryError or long garbage collection pauses, there's nothing to worry about regarding memory staying allocated. The memory should be garbage-collected by the JVM when necessary. Regards, Kien On 7/25/2017 10:53 PM, PedroMrChaves wrote: Hello, Thank you for the rep

Re: Purging Late stream data

2017-07-25 Thread Kien Truong
Hi, One method you can use is a ProcessFunction. In the process function, you get the timer service through the function context, which can then be used to schedule a task to clean up late data. Check out the docs for ProcessFunction https://ci.apache.org/projects/flink/flink-docs-rel
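A minimal sketch of the approach Kien describes, assuming a keyed stream of Tuple2<String, Long> records with event-time timestamps assigned upstream; the class name, state name, and retention period are placeholders:

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class LateDataPurger extends ProcessFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    // placeholder: how long per-key state is kept after the last element for that key
    private static final long RETENTION_MS = 60 * 60 * 1000L;

    private transient ValueState<Long> lastTimestamp;

    @Override
    public void open(Configuration parameters) {
        lastTimestamp = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastTimestamp", Long.class));
    }

    @Override
    public void processElement(Tuple2<String, Long> value, Context ctx,
                               Collector<Tuple2<String, Long>> out) throws Exception {
        lastTimestamp.update(ctx.timestamp());
        out.collect(value);
        // the timer service (from the function context) schedules the cleanup:
        // onTimer() fires once the watermark passes timestamp + RETENTION_MS
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + RETENTION_MS);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx,
                        Collector<Tuple2<String, Long>> out) throws Exception {
        Long last = lastTimestamp.value();
        if (last != null && last + RETENTION_MS <= timestamp) {
            // no newer element arrived for this key within the retention period,
            // so purge its state instead of keeping it around for late data
            lastTimestamp.clear();
        }
    }
}
```

It would be applied as `stream.keyBy(0).process(new LateDataPurger())`; what counts as "late" here is entirely determined by the chosen retention period.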

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-25 Thread XiangWei Huang
Hi, The browser I am using is Google Chrome, version 59.0.3071.115, and the issue persists when I try Firefox. Regards, XiangWei > On 2017-07-25, at 17:48, Chesnay Schepler wrote: > > Hello, > > Could you tell us which browser you are using, including the version? > (and maybe try out if the iss

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-25 Thread Tzu-Li (Gordon) Tai
Hi, I couldn’t seem to reproduce this. Taking another look at your description, one thing I spotted was that your Kafka broker installation versions are 0.10.1.0, while the Kafka consumer uses Kafka clients of version 0.10.0.1 (by default, as shown in your logs). I’m wondering whether or not th

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread prashantnayak
Hi Stephan, I'm unclear on what you mean by the "trash" option... I thought that was only available for the command-line Hadoop tools and not applicable to the API, which is what Flink uses? If there is a configuration for the Flink/Hadoop connector, please let me know. Also, one additional thing about S3: S3 su

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread prashantnayak
Thanks Stephan. We can confirm that turning off RocksDB incremental checkpointing seems to help and greatly reduces the number of files (from tens of thousands to low thousands). We still see that there is an inflection point where running > 50 jobs causes the appmaster to stop deleting files from S

Purging Late stream data

2017-07-25 Thread G.S.Vijay Raajaa
Hi, I have 3 streams which are being merged via a union of Kafka topics on a given timestamp. The problem I am facing is that, if there is a delay in one of the streams and the data in that particular stream arrives at a later point in time, the merge happens in a delayed fashion. The wa

Re: How can I set charset for flink sql?

2017-07-25 Thread Ted Yu
Logged CALCITE-1903 for this bug. FYI On Tue, Jul 25, 2017 at 6:39 PM, 程骥 wrote: > OK,thanks for remind me. > > My sql like this(contain a Chinese word): > > SELECT > 'HIGH' AS LEVEL, > 'Firewall uplink bandwidth exception:greater than 1' AS content, > `system.process.username`, > `system.p

Re: How can I set charset for flink sql?

2017-07-25 Thread 程骥
OK, thanks for reminding me. My SQL is like this (it contains a Chinese word): SELECT 'HIGH' AS LEVEL, 'Firewall uplink bandwidth exception:greater than 1' AS content, `system.process.username`, `system.process.memory.rss.bytes` FROM test WHERE `system.p

Re: Connecting to remote task manager failed

2017-07-25 Thread Charith Wickramarachchi
I did some more digging. It seems the CoGroup operation failed in one of the workers, but I do not face this issue when running other tasks. Thanks, Charith On Tue, Jul 25, 2017 at 2:06 PM, Charith Wickramarachchi < charith.dhanus...@gmail.com> wrote: > Hi All, > > I'm getting an exception when

Connecting to remote task manager failed

2017-07-25 Thread Charith Wickramarachchi
Hi All, I'm getting an exception when running a Gelly task using the Pregel model. It complains that the remote task manager might be lost, but the task managers seem to be active based on the Flink dashboard. Also, other tasks run fine without an issue. Following is a summary of the exception trace. I

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-25 Thread Bowen Li
Hi Stephan, Making Flink's S3 integration independent of Hadoop is great. We've been running into a lot of Hadoop configuration trouble when trying to enable Flink checkpointing with S3 on AWS EMR. Is there any concrete plan or tickets created yet for tracking? Thanks, Bowen On Mon, J

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-25 Thread PedroMrChaves
Hello, Thank you for the reply. The problem is not that the task manager uses a lot of memory; the problem is that every time I cancel and re-submit the job, the task manager does not release the previously allocated memory. Regards, Pedro Chaves. - Best Regards, Pedro Chaves -- View this

Re: How can I set charset for flink sql?

2017-07-25 Thread Nico Kruber
Please, for the sake of making your email searchable, do not post stack traces as screenshots but rather paste the text into your email. On Tuesday, 25 July 2017 12:18:56 CEST 程骥 wrote: > My sql like this(contain a Chinese word) > > Get exception when I submit the job to cluster. > > > > Is there anyon

Re: Unable to make mapWithState work correctly

2017-07-25 Thread Nico Kruber
Hi Victor, from a quick look at your code, I think you set up everything just fine (I'm not too familiar with Scala though), but the problem is probably somewhere else: as [1] states (a bit hidden maybe), checkpoints are only used to recover from failures, e.g. if you run your job on 2 task mana

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-25 Thread Kaepke, Marc
Hi Greg, it seems that the vertex „3“ with no degree doesn't matter. I removed that vertex from the graph and, in a second test, from my input file. The ranking order is still different, and I guess wrong. Furthermore, the sum of all ranks is not 1. It depends on the beta parameter. E.g. a bet

Re: Memory Leak - Flink / RocksDB ?

2017-07-25 Thread vinay patil
Hi Shashwat, Are you specifying the RocksDBStateBackend from flink-conf.yaml or from code? If you are specifying it from code, you can try using PredefinedOptions.FLASH_SSD_OPTIMIZED. Also, you can try enabling incremental checkpointing (this feature is in Flink 1.3.0). If the above does n
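A minimal sketch of both suggestions, assuming the backend is configured from code (the checkpoint URI is a placeholder, and the RocksDBStateBackend constructor declares IOException, so this needs to run in a method that throws or handles it):

```java
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// second constructor argument enables incremental checkpointing (available from Flink 1.3.0)
RocksDBStateBackend backend =
        new RocksDBStateBackend("hdfs:///flink/checkpoints", true);

// tune RocksDB for flash storage, as suggested above
backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);

env.setStateBackend(backend);
```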

Re: How can I set charset for flink sql?

2017-07-25 Thread Timo Walther
Hi, currently Flink does not support this charset in a LIKE expression. This is due to a limitation in the Apache Calcite library. Maybe you can open an issue there. The easiest solution is to implement your own scalar function that does a `string.contains("")`. Here you can
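A minimal sketch of the workaround Timo describes (the function name, table, column, and search string are placeholders; in Flink 1.3 the SQL entry point is `tableEnv.sql(...)` on a previously created table environment):

```java
import org.apache.flink.table.functions.ScalarFunction;

// user-defined scalar function that does the substring check instead of LIKE
public class StrContains extends ScalarFunction {
    public boolean eval(String str, String search) {
        return str != null && search != null && str.contains(search);
    }
}

// registration and use (tableEnv is an assumed StreamTableEnvironment):
// tableEnv.registerFunction("STR_CONTAINS", new StrContains());
// Table result = tableEnv.sql(
//         "SELECT content FROM test WHERE STR_CONTAINS(content, '防火墙')");
```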

Memory Leak - Flink / RocksDB ?

2017-07-25 Thread Shashwat Rastogi
Hi, We have several Flink jobs, all of which read data from Kafka, do some aggregations (over sliding windows of (1d, 1h)) and write data to Cassandra. Something like: ``` DataStream lines = env.addSource(new FlinkKafkaConsumer010( … )); DataStream events = lines.map(line -> parse(line)); Dat
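For context, a hedged sketch of the windowing shape described above (the input stream `events`, its Tuple2<String, Long> type, the key, and the aggregation are placeholders). Note that with a 1-day window sliding by 1 hour, every element is assigned to 24 overlapping windows, which multiplies the state held in the backend:

```java
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

DataStream<Tuple2<String, Long>> aggregated = events
        .keyBy(0)
        .window(SlidingEventTimeWindows.of(Time.days(1), Time.hours(1)))
        .reduce(new ReduceFunction<Tuple2<String, Long>>() {
            @Override
            public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
                // placeholder aggregation: sum the counts per key
                return Tuple2.of(a.f0, a.f1 + b.f1);
            }
        });
```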

Unable to make mapWithState work correctly

2017-07-25 Thread Victor Godoy Poluceno
Hi, I am trying to write a simple streaming program to count values from a Kafka topic in a fault-tolerant manner, like this: val config: Configuration = new Configuration() config.setString(ConfigConstants.STATE_BACKEND,

Re: Flink shaded table API

2017-07-25 Thread nragon
Well, it might be Scala conflicts on my client side, since no job is sent to the stream environment. When I remove printSchema or explain, the job is sent and executed properly on the Flink side. Thanks -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink shaded table API

2017-07-25 Thread Chesnay Schepler
This sounds similar to https://issues.apache.org/jira/browse/FLINK-6173. On 25.07.2017 13:07, nragon wrote: Let's see if I can sample this :P. First i'm reading from kafka. FlinkKafkaConsumer010 consumer = KafkaSource.consumer(this.zookeeper, this.sourceName, 5); consum

Re: Flink shaded table API

2017-07-25 Thread nragon
Let's see if I can sample this :P. First I'm reading from Kafka. FlinkKafkaConsumer010 consumer = KafkaSource.consumer(this.zookeeper, this.sourceName, 5); consumer.assignTimestampsAndWatermarks(KafkaTimestampExtractor.extractor()); Then, converting my object(Data

Re: Flink shaded table API

2017-07-25 Thread Fabian Hueske
Can you post a small example to reproduce the issue? Also, which version are you using? Thanks, Fabian 2017-07-25 12:44 GMT+02:00 nragon : > I'm getting the following error when using table API: > > Caused by: java.lang.NoClassDefFoundError: > org/apache/flink/shaded/calcite/com/fasterxml/jacks

Re: How can i merge more than one flink stream

2017-07-25 Thread Fabian Hueske
If you are using tumbling time-windows, then the timestamps of the aggregated records emitted from the window are all the maximum timestamp that would have been accepted for the window. For example, if you have an hourly tumbling window, the window from 2 to 3 o'clock would include all timestamps be
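A hedged sketch illustrating the point for an hourly tumbling window (the input stream and its Tuple2<String, Long> type are placeholders): `window.maxTimestamp()` is the window end minus 1 ms, e.g. 02:59:59.999 for the 02:00-03:00 window, and downstream operators see that value as the event timestamp of every record the window emits.

```java
import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

DataStream<Tuple2<String, Long>> hourlyCounts = input
        .keyBy(0)
        .window(TumblingEventTimeWindows.of(Time.hours(1)))
        .apply(new WindowFunction<Tuple2<String, Long>, Tuple2<String, Long>, Tuple, TimeWindow>() {
            @Override
            public void apply(Tuple key, TimeWindow window, Iterable<Tuple2<String, Long>> elements,
                              Collector<Tuple2<String, Long>> out) {
                long count = 0;
                for (Tuple2<String, Long> ignored : elements) {
                    count++;
                }
                // window.maxTimestamp() == window.getEnd() - 1; this becomes the
                // event timestamp of every record emitted here.
                String k = key.getField(0);
                out.collect(Tuple2.of(k, count));
            }
        });
```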

Flink shaded table API

2017-07-25 Thread nragon
I'm getting the following error when using table API: Caused by: java.lang.NoClassDefFoundError: org/apache/flink/shaded/calcite/com/fasterxml/jackson/databind/ObjectMapper at org.apache.flink.table.explain.PlanJsonParser.getSqlExecutionPlan(PlanJsonParser.java:32) ... 18 common fr

How can I set charset for flink sql?

2017-07-25 Thread 程骥
My SQL is like this (it contains a Chinese word). I get an exception when I submit the job to the cluster. Can anyone tell me how to deal with it? Thanks!

Re: How can i merge more than one flink stream

2017-07-25 Thread Jone Zhang
“Consistent” means that in the same time window, the timestamps of the three streams should be kept the same. In my application, I am trying to build an online learning system. I need to join the streams from 1 and 2 on the SAME timestamp to form training samples which will be fed to some online l

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-25 Thread Chesnay Schepler
Hello, Could you tell us which browser you are using, including the version? (and maybe try out if the issue persists with a different one) Regards, Chesnay On 25.07.2017 05:20, XiangWei Huang wrote: hi, Sorry for replying so late. I have met this issue again and the list is constantly keep g

Re: Connect more than two streams

2017-07-25 Thread Jonas Gröger
Hello Govindarajan, one way to merge multiple streams is to union them. You can do that with the union operator described at [1]. Is that what you are looking for? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/datastream_api.html -- Jonas Am Mo, 24. Jul 2017, um 23:18, sc
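A minimal sketch of the union approach Jonas mentions (the stream contents are placeholders; all inputs must have the same element type):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// placeholder sources; in practice these would be e.g. Kafka consumers
DataStream<String> stream1 = env.fromElements("a", "b");
DataStream<String> stream2 = env.fromElements("c");
DataStream<String> stream3 = env.fromElements("d", "e");

// union produces one stream containing the elements of all three inputs
DataStream<String> merged = stream1.union(stream2, stream3);
merged.print();
```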

Is joined stream WindowedStream?

2017-07-25 Thread xie wei
Hello Flink, one question about the join operator: the Flink docs say "Join two data streams on a given key and a common window". Does "common window" mean that the joined stream is a WindowedStream? If it is a normal DataStream (not windowed), could the stream be windowed again with a different window
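For reference, a hedged sketch of the join operator's shape (the streams `left` and `right`, their Tuple2<String, Long> type, keys, and window sizes are placeholders): the result of apply() is a regular DataStream, not a WindowedStream, so it can be keyed and windowed again afterwards.

```java
import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

KeySelector<Tuple2<String, Long>, String> byKey =
        new KeySelector<Tuple2<String, Long>, String>() {
            @Override
            public String getKey(Tuple2<String, Long> t) {
                return t.f0;
            }
        };

// join "left" and "right" on the key over a common 5-minute tumbling window
DataStream<Tuple2<String, Long>> joined = left
        .join(right)
        .where(byKey)
        .equalTo(byKey)
        .window(TumblingEventTimeWindows.of(Time.minutes(5)))
        .apply(new JoinFunction<Tuple2<String, Long>, Tuple2<String, Long>, Tuple2<String, Long>>() {
            @Override
            public Tuple2<String, Long> join(Tuple2<String, Long> first, Tuple2<String, Long> second) {
                // placeholder combine logic
                return Tuple2.of(first.f0, first.f1 + second.f1);
            }
        });

// "joined" is an ordinary DataStream again and can be re-windowed, e.g.:
joined.keyBy(byKey)
      .window(TumblingEventTimeWindows.of(Time.hours(1)));
```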