Re: Failing to recover once checkpoint fails

2017-10-05 Thread Aljoscha Krettek
Hi Vishal, I think you're right! And thanks for looking into this so deeply. In your last mail you're basically saying that the checkpoint could not be restored because your HDFS was temporarily down. If Flink had not deleted that checkpoint it might have been possible to restore it at a late

Re: At end of complex parallel flow, how to force end step with parallel=1?

2017-10-05 Thread Garrett Barton
Fabian, Turns out I was wrong. My flow was in fact running in two separate jobs due to me trying to use a local variable calculated by ...distinct().count() in a downstream flow. The second flow indeed set parallelism correctly! Thank you for the help. :) On Wed, Oct 4, 2017 at 8:01 AM, Fabia

Re: Issue with CEP library

2017-10-05 Thread Ajay Krishna
Hi Kostas, I noticed that you commented on FLINK-7549 and FLINK-7606. I was monitoring both of these JIRAs. I have always been using event time as the time characteristic, as you had suggested, but I continue to see patterns not getting detected. Could you help shed more light on this? I had shared some cod

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Vishal Santoshi
I think this is the offending piece. There is a catch-all Exception which, IMHO, should distinguish a recoverable exception from an unrecoverable one.

    try {
        completedCheckpoint = retrieveCompletedCheckpoint(checkpointStateHandle);
        if (completedCheckpoint != null) {
            completedCheckpoints.add(completedCheckpoint);
        }
    } catch (Exception e) {
        // ...
    }
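For illustration, a minimal sketch of the distinction being argued for, assuming a transient HDFS outage surfaces as an IOException; the retry semantics and the log call are hypothetical, not Flink's actual code:

    // Illustrative sketch only, not Flink source: treat a transient storage
    // failure (e.g. the HDFS NameNode being down) as "abort recovery and
    // retry later" rather than silently skipping the checkpoint.
    try {
        completedCheckpoint = retrieveCompletedCheckpoint(checkpointStateHandle);
        if (completedCheckpoint != null) {
            completedCheckpoints.add(completedCheckpoint);
        }
    } catch (IOException e) {
        // Recoverable: storage may come back, so fail this recovery attempt
        // and keep the checkpoint reference in ZooKeeper.
        throw new FlinkException("Transient failure retrieving checkpoint", e);
    } catch (Exception e) {
        // Unrecoverable: the state handle itself is broken, skip it.
        LOG.warn("Skipping unreadable checkpoint " + checkpointStateHandle, e);
    }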

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Vishal Santoshi
So this is the issue; tell us if this is wrong. ZK had some state (backed by HDFS) that referred to a checkpoint (the same exact last successful checkpoint that succeeded before the NN screwed us). When the JM tried to recreate the state and, because the NN was down, failed to retrieve the CHK hand

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Vishal Santoshi
Also note that the ZooKeeper recovery (sadly on the same HDFS cluster) also showed the same behavior. It had the pointers to the checkpoint (I think that is what it does: keeps metadata of where the checkpoint is, etc.). It too decided to keep the recovery file from the failed state. -rw-

Re: kafka consumer parallelism

2017-10-05 Thread r. r.
Thanks a lot, Carst! I hadn't realized that. Best regards > Original message > From: Carst Tankink ctank...@bol.com > Subject: Re: kafka consumer parallelism > To: "r. r." > Sent: 05.10.2017 09:04 > Hi, > > The latter (map will be spread out if you rebal
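For context, a minimal sketch of the pattern under discussion, assuming a 4-partition topic and placeholder topic/broker settings: the source parallelism is capped by the partition count, and rebalance() lets the downstream map run wider.

    import java.util.Properties;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
    import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

    public class RebalanceExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("group.id", "example");

            // The source cannot usefully exceed the topic's partition count.
            DataStream<String> source = env
                .addSource(new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props))
                .setParallelism(4); // e.g. a 4-partition topic

            // rebalance() redistributes records round-robin, so the map can
            // run at a higher parallelism than the source.
            source.rebalance()
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .setParallelism(16)
                .print();

            env.execute("kafka consumer parallelism example");
        }
    }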

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Vishal Santoshi
Another thing I noted was this:

    drwxr-xr-x - root hadoop 0 2017-10-04 13:54 /flink-checkpoints/prod/c4af8dfa864e2f9a51764de9f0725b39/chk-44286
    drwxr-xr-x - root hadoop 0 2017-10-05 09:15 /flink-checkpoints/prod/c4af8dfa864e2f9a51764de9f0725b39/chk-45428

Generally what

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Vishal Santoshi
Hello Fabian, First of all, congratulations on this fabulous framework. I have worked with GDF, and though GDF has some natural pluses, Flink's state management is far more advanced. With Kafka as a source it negates issues GDF has (GDF integration with pub/sub is organic and th

Re: Calculating metrics real time

2017-10-05 Thread Fabian Hueske
That depends on the state backend [1] that you are using. If you use the RocksDBStateBackend, state is written to RocksDB, which persists it to disk. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/ops/state_backends.html 2017-10-05 14:41 GMT+02:00 Rahul Raj : > Thanks for the respon
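A minimal sketch of that setup, with the checkpoint path a placeholder:

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RocksDBBackendExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            // Working state lives in RocksDB on the TaskManager's local disk;
            // checkpoints of that state go to the given (placeholder) HDFS path.
            env.setStateBackend(new RocksDBStateBackend("hdfs:///flink-checkpoints"));

            env.fromElements(1, 2, 3).print();
            env.execute("rocksdb state backend example");
        }
    }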

Re: Calculating metrics real time

2017-10-05 Thread Rahul Raj
Thanks for the response. So I am guessing windows in Flink will store the records in memory before processing them. Correct? Rahul Raj On Oct 5, 2017 17:50, "Fabian Hueske" wrote: > Hi, > > I'd suggest having a look at the window operators [1]. For example, a > tumbling window of 1 minute can

Re: Kafka Producer writeToKafkaWithTimestamps; name, uid, parallelism fails

2017-10-05 Thread Fabian Hueske
Hi Yunus, thanks for reporting this problem. I opened the JIRA issue FLINK-7764 [1] for this. Thanks, Fabian [1] https://issues.apache.org/jira/browse/FLINK-7764 2017-10-05 13:38 GMT+02:00 Yunus Olgun : > Hi, > > I am using Flink 1.3.2. When I try to use KafkaProducer with timestamps it > fail

Re: Calculating metrics real time

2017-10-05 Thread Fabian Hueske
Hi, I'd suggest having a look at the window operators [1]. For example, a tumbling window of 1 minute can be used to compute metrics every minute. Flink's window operators are very extensible and you can implement quite custom logic if the predefined windows don't match your use case. In any case,
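A minimal sketch of such a tumbling window, with the (pageId, clicks) input type entirely hypothetical:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.windowing.time.Time;

    public class PerMinuteClicks {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical input: (pageId, clicks) events; a real job would
            // read these from Kafka or another source.
            env.fromElements(Tuple2.of("page-1", 1), Tuple2.of("page-2", 1))
                .keyBy(0)                    // key by pageId
                .timeWindow(Time.minutes(1)) // tumbling 1-minute window
                .sum(1)                      // clicks per page per minute
                .print();

            env.execute("per-minute click counts");
        }
    }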

Kafka Producer writeToKafkaWithTimestamps; name, uid, parallelism fails

2017-10-05 Thread Yunus Olgun
Hi, I am using Flink 1.3.2. When I try to use KafkaProducer with timestamps it fails to set name, uid or parallelism. It uses default values.

———

    FlinkKafkaProducer010.FlinkKafkaProducer010Configuration producer =
        FlinkKafkaProducer010.writeToKafkaWithTimestamps(stream, topicName, schema
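For reference, the calls that reportedly have no effect would look roughly like this (a sketch; stream, topicName, schema, and props are assumed to be defined elsewhere):

    // Sketch of the reported problem (Flink 1.3.2): the returned
    // FlinkKafkaProducer010Configuration is a DataStreamSink, but the
    // name/uid/parallelism calls on it reportedly do not take effect.
    FlinkKafkaProducer010.FlinkKafkaProducer010Configuration<String> producer =
        FlinkKafkaProducer010.writeToKafkaWithTimestamps(stream, topicName, schema, props);
    producer.name("kafka-sink");     // reportedly ignored
    producer.uid("kafka-sink-uid");  // reportedly ignored
    producer.setParallelism(4);      // reportedly ignored, default used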

Calculating metrics real time

2017-10-05 Thread Rahul Raj
Hi, I have to calculate some complicated metrics like click-through rate, click value rate and conversions on real-time data using Flink. But I am not sure which Flink functionality I should use to program this, because it involves collecting some records in memory for a certain time, maybe 1 m

Re: Failing to recover once checkpoint fails

2017-10-05 Thread Fabian Hueske
Hi Vishal, window operators are always stateful because the operator needs to remember previously received events (WindowFunction) or intermediate results (ReduceFunction). Given the program you described, a checkpoint should include the Kafka consumer offset and the state of the window operator.
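A minimal sketch of the corresponding setup; the interval is illustrative:

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    // Enable periodic checkpoints; each checkpoint then snapshots the Kafka
    // consumer offsets and the window operator's state consistently together.
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60000); // every 60 seconds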

Re: Savepoints and migrating value state data types

2017-10-05 Thread mrooding
Gordon, Thanks for the detailed response. I have verified your assumption and that is, unfortunately, the case. I also looked into creating a custom Kryo serializer, but I got stuck on serializing arrays of complex types. It seems like this isn't trivial at all with Kryo. As an alternative, I've b
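For reference, a minimal sketch of registering a custom Kryo serializer with Flink, where MyType and its fields are entirely hypothetical:

    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.Serializer;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;

    // Hypothetical example: MyType is assumed to have a getName() accessor
    // and a single-argument String constructor.
    public class MyTypeSerializer extends Serializer<MyType> {
        @Override
        public void write(Kryo kryo, Output output, MyType value) {
            output.writeString(value.getName());
        }

        @Override
        public MyType read(Kryo kryo, Input input, Class<MyType> type) {
            return new MyType(input.readString());
        }
    }

    // Registration, e.g. in the job's main():
    env.getConfig().registerTypeWithKryoSerializer(MyType.class, MyTypeSerializer.class);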