Re: Monitoring single-run job statistics

2016-01-04 Thread Till Rohrmann
Hi Filip, at the moment it is not possible to retrieve the job statistics after the job has finished with flink run -m yarn-cluster. The reason is that the YARN cluster is only alive as long as the job is executed. Thus, I would recommend you to execute your jobs with a long running Flink cluster

Re: 2015: A Year in Review for Apache Flink

2016-01-04 Thread Till Rohrmann
Happy New Year :-) Hope everyone had a great start into the new year. On Thu, Dec 31, 2015 at 12:57 PM, Slim Baltagi wrote: > Happy New Year to you and your families! > Let’s make 2016 the year of Flink: General Availability, faster growth, > wider industry adoption, … > Slim Baltagi > Chicago,

Re: [ANNOUNCE] Introducing Apache Flink Taiwan User Group - Flink.tw

2016-01-04 Thread Liang Chen
Awesome! Whether also inclue simple chinese or only Traditional chinese? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ANNOUNCE-Introducing-Apache-Flink-Taiwan-User-Group-Flink-tw-tp4136p4147.html Sent from the Apache Flink User Mailing Li

Re: Scala API and sources with timestamp

2016-01-04 Thread Till Rohrmann
Hi Don, yes that's exactly how you use an anonymous function as a source function. Cheers, Till On Tue, Dec 22, 2015 at 3:16 PM, Don Frascuchon wrote: > Hello, > > There is a way for define a EventTimeSourceFunction with anonymous > functions from the scala api? Like that: > > env.addSource

Re: Problem to show logs in task managers

2016-01-04 Thread Ana M. Martinez
Hi Till, Sorry for the delay (Xmas break). I have activated log aggregation on flink-conf.yaml with yarn.log-aggregation-enable: true (as I can’t find a yarn-site.xml). But the command yarn logs -applicationId application_1451903796996_0008 gives me the following output: INFO client.RMProxy: C

Flink on EMR Question

2016-01-04 Thread Chiwan Park
Hi All, I have some problems using Flink on Amazon EMR cluster. Q1. Sometimes, jobmanager container still exists after destroying yarn session by pressing Ctrl+C. In that case, Flink YARN app seems exited correctly in YARN RM dashboard. But there is a running container in the dashboard. From lo

Re: Problem to show logs in task managers

2016-01-04 Thread Till Rohrmann
I think the YARN application has to be finished in order for the logs to be accessible. Judging from you commands, you’re starting a long running YARN application running Flink with ./bin/yarn-session.sh -n 1 -tm 2048 -s 4. This cluster won’t be used though, because you’re executing your job with

Re: Unit testing support for flink application?

2016-01-04 Thread Filipe Correia
Hi list, Here's a concrete example of an issue that I've found when trying to unit test a flink app (scroll down to see the console output): https://gist.github.com/filipefigcorreia/fdf106eb3d40e035f82a I am creating a custom datasink to collect the results, but the execution seems to finish befo

Re: Monitoring single-run job statistics

2016-01-04 Thread Filip Łęczycki
Hi Till, Thank you for you answer however I am sorry to hear that. I was reluctant to execute jobs with long running Flink cluster due to the fact that multiple jobs would cloud yarn statistics regarding cpu and memory time as well as Flink's garbage collector statistics in log, as they would be s

Questions re: ExecutionGraph & ResultPartitions for interactive use a la Spark

2016-01-04 Thread kovas boguta
I'm impressed with the Flink API, it seems simpler and more composable than what I've seen elsewhere. I'm trying to see how to achieve a more interactive, REPL-driven experience, similar to Spark. I'm consuming Flink from Clojure. For now I'm only interested in smaller clusters & interactive usag

Re: streaming state

2016-01-04 Thread Alex Rovner
Thank you Stephan for the information! On Mon, Dec 14, 2015 at 5:20 AM Stephan Ewen wrote: > Hi Alex! > > Right now, Flink would not reuse Kafka's partitioning for joins, but > shuffle/partition data by itself. Flink is very fast at shuffling and adds > very little latency on shuffles, so that i

kafka integration issue

2016-01-04 Thread Alex Rovner
Hello Flinkers! The below program produces the following error when running locally. I am building the program using maven, using 0.10.0 and running in streaming only local mode "start-local-streaming.sh". I have verified that kafka and the topic is working properly by using kafka-console-*.sh sc

Streaming in Flink

2016-01-04 Thread Sourav Mazumder
Hi, Does Flink support push based data streaming where the data source can push the events/data to Flink cluster over a socket (instead of Flink pulling the data at a given frequency)? Regards, Sourav

Re: Streaming in Flink

2016-01-04 Thread 戴資力
Hi Sourav, Flink's streaming processes incoming data by-each-entry (true streaming, as compared to micro-batch), and streaming is inherently designed as a push-model, where a topology of stream transformations "listens" to a data source. You can have a Flink streaming topology's data source confi

Re: Streaming in Flink

2016-01-04 Thread Sourav Mazumder
Hi Gordon, Need little more clarification around reading data from Kafka. As soon as any component behaves as a consumer of a topic/queue (iof a messaging system), it essentially does polling of the data after a regular interval (that interval may be small though). Hence essentially it captures a

Re: [ANNOUNCE] Introducing Apache Flink Taiwan User Group - Flink.tw

2016-01-04 Thread 戴資力
Thanks for all the support guys! Hi Liang, The co-authors of the blog mostly use traditional Chinese. I think we'll be sticking with this for a while since the majority of the group members are more familiar with traditional Chinese too :) On Mon, Jan 4, 2016 at 5:43 PM, Liang Chen wrote: > A

Re: Streaming in Flink

2016-01-04 Thread Chiwan Park
Hi Sourav, Basically, Kafka consumer is pull-based [1]. If you want to build push-based system, you should use other options. Flink supports both pull-based and push-based paradigm. It depends upon an implementation of data source. As one of examples, Flink provides a streaming source function