Questions

2017-07-26 Thread Егор Литвиненко
Hi Is there a way to process mapping errors in Flink? For example when string is valid double write in one table, otherwise in another? If not, what problems you see reffered to this opportunity and if I will make PR, where I should start to implenent this feature? I saw Tuple1, 2, etc. Many meth

Re: is there ways to enable checkpoint from flink-conf.yaml?

2017-07-26 Thread Ivan
one more thing I found is a little confusing for Restart Strategies document is /The default restart strategy is set via Flink’s configuration file flink-conf.yaml. The configuration parameter restart-st

Re: FlinkKafkaConsumer subscribes to partitions in restoredState only.

2017-07-26 Thread ninad
Got it. Thanks Gordon. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaConsumer-subscribes-to-partitions-in-restoredState-only-tp14233p14484.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble

CEP condition expression and its event consuming strategy

2017-07-26 Thread Chao Wang
Hi, I have two questions regarding the use of the Flink CEP library (flink-cep_2.11:1.3.1), as follows: 1. I'd like to know how to use the API to express "emit event C in the presence of events A and B, with no restriction on the arriving order of A and B"? I've tried by creating two pattern

Re: Gelly PageRank implementations in 1.2 to 1.3

2017-07-26 Thread Greg Hogan
Hi Marc, FLINK-7273 updates Gelly's PageRank to optionally include zero-degree vertices (the performance cost looks to be significant so this is disabled by default). I created FLINK-7277 to work on a weighted PageRank implementation. The greater challenge is integrating weighted graphs into t

Watermarking and Timestamp on Kafka stream union

2017-07-26 Thread G.S.Vijay Raajaa
HI, I am having a union of 3 kafka topic stream, i am joining them by a timestamp field. I would like to order the join by timestamp. How do I assign a watermark and extract timestamp from a union stream? Regards, Vijay Raajaa GS

Distributed reading and parsing of protobuf files from S3 in Apache Flink

2017-07-26 Thread ShB
I'm working with Apache Flink on reading, parsing and processing data from S3. I'm using the DataSet API, as my data is bounded and doesn't need streaming semantics. My data is on S3 in binary protobuf format in the form of a large number of timestamped files. Each of these files have to be read,

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread prashantnayak
Thanks Stefan. +1 on "I am even considering packing this list as a plain text file with the checkpoint, to make this more transparent for users" that is def. more Ops friendly... Thanks Prashant -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stefan Richter
Hi, your concerns about deleting files when using incremental checkpoints is very valid. Deleting empty checkpoint folders is obviously ok. As for files, I have recently added some additional logging to the checkpointing mechanism to report the files referenced in the last checkpoint. I will tr

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread prashantnayak
Thanks Stephan and Stefan We're looking forward to this patch in 1.3.2 We will use a patched version depending upon when 1.3.2 is going to be available. We're also implementing a cron job to remove orphaned/older completedCheckpoint files per your recommendations.. one caveat with a job like th

Re: Connecting to remote task manager failed

2017-07-26 Thread Charith Wickramarachchi
Hi Fabian, I see the following exceptions in worker logs (Exception trace is similar to the one I attached above). I m wondering if its a network configuration issue because it refers to a 127.0.1.1 address. Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerg

Re: is there ways to enable checkpoint from flink-conf.yaml?

2017-07-26 Thread Chesnay Schepler
There is no option that enables checkpointing for all jobs. If you have control over///all/ jobs, as a *hack*, you could load the configuration manually (I don't think it is exposed through the execution environment) using "GlobalConfiguration.loadConfiguration()", manually check it for whatev

is there ways to enable checkpoint from flink-conf.yaml?

2017-07-26 Thread Ivan
Hi , Flink users we are using Flink as the runtime of our beam jobs which works great, recently we want to enable restart strategy in our flink cluster, from the document I see restart strategy will only

Class not found when deserializing

2017-07-26 Thread Paolo Cristofanelli
Hi, I am trying to write and read in a Kafka topic a user-defined class (that implements serializable, and all the fields are serializable). Everything works fine when I am executing the program in the IDE or with the mvn exec command. When I try to execute the program in standalone mode I get the

RE: Logback user class

2017-07-26 Thread Nuno Rafael Goncalves
Even though I have executed the same code with intelliJ and works fine? The only difference is that intelliJ is using log4j and the other application is using logback. Moreover, the snippet is quite simple, it does not reference any user class other than flink's Does flink uses user loaded log im

Re: FlinkKafkaConsumer010 - Memory Issue

2017-07-26 Thread Stephan Ewen
Hi Pedro! Can you make a heap dump after a few re-submissions and share a screenshot or so showing which memory piles up? Is it the buffer memory from the Kafka Producer or Consumer? Stephan On Wed, Jul 26, 2017 at 8:11 AM, Kien Truong wrote: > Hi Pedro, > > As long as there's no OutOfMemoryE

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stephan Ewen
Stefan is correct in pointing out the lines to remove. FYI: We are trying to add this patch to the upcoming 1.3.2 release which should also help: https://github.com/apache/flink/pull/4397 On Wed, Jul 26, 2017 at 9:48 AM, Stefan Richter wrote: > Hi, > > I think Stephan was talking about removing

Re: Logback user class

2017-07-26 Thread nragon
I've changed that line and compiled it into lib/. Error remains. I'm running a local custer with start-local.sh -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Logback-user-class-tp14464p14469.html Sent from the Apache Flink User Mailing List

Re: Logback user class

2017-07-26 Thread Ted Yu
Please take a look at FLINK-6767. On Wed, Jul 26, 2017 at 3:53 AM, nragon wrote: > Hi, > > I executing the following snippet on two different environments. > > > StreamExecutionEnvironment streamEnv = > StreamExecutionEnvironment.createRemoteEnvironment("x", 6123); > streamE

Re: How can I set charset for flink sql?

2017-07-26 Thread Fabian Hueske
As Timo proposed, I would implement a Scalar user-defined function which returns a boolean and use that instead of LIKE. Have a look here [1]. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/udfs.html#scalar-functions 2017-07-26 3:47 GMT+02:00 Ted Yu : >

Re: Connecting to remote task manager failed

2017-07-26 Thread Fabian Hueske
Hi, Do you have an exception for the the CoGroup failure? Best, Fabian 2017-07-26 3:32 GMT+02:00 Charith Wickramarachchi < charith.dhanus...@gmail.com>: > I did some more digging. It seems the CoGroup operation failed in one of > the workers. But I do not face this issue when running other task

Re: Purging Late stream data

2017-07-26 Thread G.S.Vijay Raajaa
Sure, Let me try that out. On the same note, does BoundedOutOfOrdernessTimestampExtractor Serve the purpose too? Regards, Vijay Raajaa GS On Wed, Jul 26, 2017 at 9:22 AM, Kien Truong wrote: > Hi, > > One method you can use is using a ProcessFunction. > > In the process function, you get the ti

Logback user class

2017-07-26 Thread nragon
Hi, I executing the following snippet on two different environments. StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.createRemoteEnvironment("x", 6123); streamEnv.setParallelism(10); StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-26 Thread Chesnay Schepler
So this /only/ happens when you select a metric? Without a selected metric everything works fine? Are the metrics you selected shown correctly? Did you modify the "jobmanager.web.refresh-interval" setting? (maybe check the flink-conf-yaml for the current setting) On 26.07.2017 04:57, XiangWe

Re: Memory Leak - Flink / RocksDB ?

2017-07-26 Thread Shashwat Rastogi
Hi Vinay, @Vinay : I am setting RocksDBStateBackend from the code, not from flink-conf.yaml. I am currently trying out the configurations that you have shared. I’ll let you know how they perform. Thank you so much for your help. However, were you able to figure out what exactly is going wrong

Re: Flink shaded table API

2017-07-26 Thread nragon
Hi Fabian, It's a dependency problem between our current libraries and flink's. Thanks -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-shaded-table-API-tp14432p14461.html Sent from the Apache Flink User Mailing List archive. mailing l

Re: Flink shaded table API

2017-07-26 Thread Fabian Hueske
Hi, I tried to reproduce the problem, but did not get the exception you posted. I implemented a job similar to yours and ran it both from the IDE and using the flink submission client using the 1.3.1 binaries. In both cases, the schema was correctly printed. Do you get the exception in the IDE or

Re: S3 recovery and checkpoint directories exhibit explosive growth

2017-07-26 Thread Stefan Richter
Hi, I think Stephan was talking about removing this part: try { FileUtils.deletePathIfEmpty(fs, filePath.getParent()); } catch (Exception ignored) {} This part should *NOT* be removed: fs.delete(filePath, false); The reason is that the first is only an