ReduceByKeyAndWindow in Flink

2015-11-22 Thread Konstantin Knauf
Hi everyone, me again :) Let's say you have a stream, and for every window and key you compute some aggregate value, like this: DataStream.keyBy(..).timeWindow(..).apply(...) Now I want to get the maximum aggregate value for every window over the keys. This feels like a pr
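
The two-stage computation described in this message can be illustrated without any Flink dependency. The following is a minimal plain-Java sketch, not Flink code and not necessarily the approach discussed in the thread: it assumes tumbling windows aligned to multiples of the window size and uses a sum as the per-key aggregate. The class, method, and field names (WindowMaxSketch, Event, sumPerWindowAndKey, maxPerWindow) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names come from Flink.
final class WindowMaxSketch {

    /** One input record: event time in milliseconds, a key, and a value. */
    static final class Event {
        final long timestampMs;
        final String key;
        final double value;

        Event(long timestampMs, String key, double value) {
            this.timestampMs = timestampMs;
            this.key = key;
            this.value = value;
        }
    }

    /** Stage 1: sum the values per (window start, key), assuming tumbling windows. */
    static Map<Long, Map<String, Double>> sumPerWindowAndKey(List<Event> events, long windowSizeMs) {
        Map<Long, Map<String, Double>> result = new TreeMap<>();
        for (Event e : events) {
            // Align the timestamp down to the start of its window.
            long windowStart = e.timestampMs - Math.floorMod(e.timestampMs, windowSizeMs);
            result.computeIfAbsent(windowStart, w -> new HashMap<>())
                  .merge(e.key, e.value, Double::sum);
        }
        return result;
    }

    /** Stage 2: for each window, take the maximum aggregate over all keys. */
    static Map<Long, Double> maxPerWindow(Map<Long, Map<String, Double>> perWindowAndKey) {
        Map<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, Map<String, Double>> window : perWindowAndKey.entrySet()) {
            double max = Double.NEGATIVE_INFINITY;
            for (double aggregate : window.getValue().values()) {
                max = Math.max(max, aggregate);
            }
            result.put(window.getKey(), max);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event(1_000L, "a", 1.0),
            new Event(2_000L, "b", 5.0),
            new Event(3_000L, "a", 2.0),
            new Event(11_000L, "a", 4.0),
            new Event(12_000L, "b", 1.0));

        // Window [0, 10000): a=3.0, b=5.0; window [10000, 20000): a=4.0, b=1.0
        System.out.println(maxPerWindow(sumPerWindowAndKey(events, 10_000L)));
        // prints {0=5.0, 10000=4.0}
    }
}
```

In a streaming setting the second stage would have to run over the output of the first stage, grouped by window rather than by key; how best to express that in the DataStream API is the question the message raises.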

How to pass hdp.version to flink on yarn

2015-11-22 Thread Jagat Singh
Hi, I am running an example Flink program (Pivotal HDP) with ./bin/flink run -m yarn-cluster -yn 2 ./examples/WordCount.jar and I am getting the error below. How do I pass stack.name and stack.version to the Flink program? This is similar to what we give to Spark as hdp.version. Example spark.driver.extr

Watermarks as "process completion" flags

2015-11-22 Thread Anton Polyakov
Hi, I am very new to Flink and in fact have never used it. My task (which I currently solve using a home-grown Redis-based solution) is quite simple: I have a system which produces some events (trades; it is a financial system) and a computational chain which computes some measure cumulatively over th

Re: placement preferences for streaming jobs

2015-11-22 Thread Stephan Ewen
Hi Stefania! I think there is no hook for that right now. If I understand you correctly, assuming you run YARN or so, you want to give the sources a set of hostnames, and when scheduling, the sources have preferences for those nodes. Within a dataflow program (job), Flink will attempt to co-locat