Re: Using Flink Streaming to write to multiple output files in HDFS

2015-11-09 Thread Nyamath Ulla Khan
Hi Andra, you can find a very interesting example of Flink streaming with Kafka (input/output) here: https://flink.apache.org/news/2015/02/09/streaming-example.html. http://dataartisans.github.io/flink-training/exercises/ (contains examples of most of the different operators) http://dataartisans.github.i

Re: Using Flink Streaming to write to multiple output files in HDFS

2015-11-09 Thread Robert Metzger
Hey Andra, were you able to answer your questions from Aljoscha's and Fabian's links? Flink's streaming file sink is quite unique (compared to Flume) because it supports exactly-once semantics. Also, the performance compared to Storm is probably much better, so you can save a lot of resources. On

Re: Using Flink Streaming to write to multiple output files in HDFS

2015-10-21 Thread Fabian Hueske
There are also training slides and programming exercises (incl. reference solutions) for the DataStream API at http://dataartisans.github.io/flink-training/ Cheers, Fabian 2015-10-21 14:03 GMT+02:00 Aljoscha Krettek : > Hi, > the documentation has a guide about the Streaming API: > > https:

Re: Using Flink Streaming to write to multiple output files in HDFS

2015-10-21 Thread Aljoscha Krettek
Hi, the documentation has a guide about the Streaming API: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html This also contains a section about the rolling (HDFS) FileSystem sink: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#hadoop

Using Flink Streaming to write to multiple output files in HDFS

2015-10-21 Thread Andra Lungu
Hey guys, long time, no see :). I recently started a new job and it involves performing a set of real-time data analytics using Apache Kafka, Storm and Flume. What happens, on a very high level, is that a set of signals is collected, stored into a Kafka topic and then Storm is used to filter certai