Running on AWS/EMR/Yarn - where is the WebUI?

2016-08-15 Thread Jon Yeargers
Working with a 3 node cluster. Started via YARN. If I go to port 8080 I see the Tomcat start screen. 8088 has the Yarn screen. Didn't see anything obvious to start the UI in the bin folder.

Performance issues - is my topology not setup properly?

2016-08-15 Thread Jon Yeargers
Flink 1.1.1 is running on AWS / EMR. 3 boxes - total 24 cores and 90Gb of RAM. Job is submitted via yarn. Topology: read csv files from SQS -> parse files by line and create object for each line -> pass through 'KeySelector' to pair entries (by hash) over 60 second window -> write original and

Output of KeyBy->TimeWindow->Reduce?

2016-08-07 Thread Jon Yeargers
Take the above topology and send the resultant DataStream to .print(). Assume the Reduce function changes a character to it's uppercase equivalent. Assume this whole stream makes it within the span of a single Window. Send it a stream like a,b,c,d,e,f,g,h,c,e,g. What will be output? a) a,b,C,

What is output from DataSet.print()?

2016-08-02 Thread Jon Yeargers
Topology snip: datastream = some_stream.keyBy(keySelector).timeWindow(Time.seconds(60)).reduce(new some_KeyReduce()); If I have a KeySelector that's pretty 'loose' (IE lots of matches) the 'some_KeyReduce' function gets hit frequently and some set of values is printed out via 'datastream.print(