Re: Kinesis connector classpath issue when running Flink 1.1-SNAPSHOT on YARN

2016-06-16 Thread Tai Gordon
Hi Josh, I’m looking into the problem. Seems like the connector is somehow using older versions of httpclient. Can you print the loaded class path at runtime, and check the path & version of the loaded httpclient / httpcore dependency? i.e. `classOf[HttpConnectionParams].getResource("HttpConnectio

Re: Moving from single-node, Maven examples to cluster execution

2016-06-16 Thread Prez Cannady
All right, I figured I’d have to do shading, but hadn’t gotten around to experimenting. I’ll try it out. Prez Cannady p: 617 500 3378 e: revp...@opencorrelate.org GH: https://github.com/opencorrelate LI: https://www.li

Re: Moving from single-node, Maven examples to cluster execution

2016-06-16 Thread Josh
Hi Prez, You need to build a jar with all your dependencies bundled inside. With maven you can use maven-assembly-plugin for this, or with SBT there's sbt-assembly. Once you've done this, you can login to the JobManager node of your Flink cluster, copy the jar across and use the Flink command lin

Moving from single-node, Maven examples to cluster execution

2016-06-16 Thread Prez Cannady
Having a hard time trying to get my head around how to deploy my Flink programs to a pre-configured, remote Flink cluster setup. My Mavenized setup uses Spring Boot (to simplify class path handling and generate pretty logs) to execute provision a StreamExecutionEnvironment with Kafka sources an

Re: Checkpoint takes long with FlinkKafkaConsumer

2016-06-16 Thread Ufuk Celebi
Thanks :) On Thu, Jun 16, 2016 at 3:21 PM, Hironori Ogibayashi wrote: > Ufuk, > > Yes, of course. I will be sure to update when I got some more information. > > Hironori > > 2016-06-16 1:56 GMT+09:00 Ufuk Celebi : >> Hey Hironori, >> >> thanks for reporting this. Could you please update this thre

Documentation for translation of Job graph to Execution graph

2016-06-16 Thread Bajaj, Abhinav
Hi, When troubleshooting a flink job, it is tricky to map the Job graph (application code) to the logs & monitoring REST APIs. So, I am trying to find documentation on how a Job graph is translated to Execution graph. I found this - https://ci.apache.org/projects/flink/flink-docs-release-1.0/i

Kinesis connector classpath issue when running Flink 1.1-SNAPSHOT on YARN

2016-06-16 Thread Josh
Hey, I've been running the Kinesis connector successfully now for a couple of weeks, on a Flink cluster running Flink 1.0.3 on EMR 2.7.1/YARN. Today I've been trying to get it working on a cluster running the current Flink master (1.1-SNAPSHOT) but am running into a classpath issue when starting

Re: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-16 Thread Till Rohrmann
Hi Arnaud, at the moment the environment variable is the only way to specify a different config directory for the CLIFrontend. But it totally makes sense to introduce a --configDir parameter for the flink shell script. I'll open an issue for this. Cheers, Till On Thu, Jun 16, 2016 at 5:36 PM, LI

Re: Elasticsearch Connector

2016-06-16 Thread Till Rohrmann
Hi Eamon, in order to use the snapshot binaries you have to add the snapshot repository to your pom.xml: apache.snapshots Apache Development Snapshot Repository https://repository.apache.org/content/repositories/snapshots/ false true Cheers, Till

Re: How MapFunction gets executed?

2016-06-16 Thread Till Rohrmann
Hi Yan Chou Chen, Flink does not instantiate for each record a mapper. Instead, it will create as many mappers as you've defined with the parallelism. Each of these mappers is deployed to a slot on a TaskManager. When it is deployed and before it receives records, the open method is called once. T

RE: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-16 Thread LINZ, Arnaud
Okay, is there a way to specify the flink-conf.yaml to use on the ./bin/flink command-line? I see no such option. I guess I have to set FLINK_CONF_DIR before the call ? -Message d'origine- De : Maximilian Michels [mailto:m...@apache.org] Envoyé : mercredi 15 juin 2016 18:06 À : user@fli

Elasticsearch Connector

2016-06-16 Thread Eamon Kavanagh
Hey Support, I'm trying to use Flink's Elasticsearch connector but I'm having trouble. When I add the dependency seen here ( https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/connectors/elasticsearch2.html) to my pom file, IntelliJ can't find it. I also can't find it on the ma

Re: dataset dataframe join

2016-06-16 Thread Vishnu Viswanath
Thank you Till, On Thu, Jun 16, 2016 at 10:08 AM, Till Rohrmann wrote: > Hi Vishnu, > > currently the only way to do this, is to persist the DataSet (e.g. writing > to a file) and then reading from the persisted form (e.g. file) in the open > method of a rich function in the DataStream program.

How MapFunction gets executed?

2016-06-16 Thread Yan Chou Chen
A quick question. When running a stream job that executes DataStream.map(MapFunction) , after data is read from Kafka, does each MapFunction is created per item or based on parallelism? For instance, for the following code snippet val env = StreamExecutionEnvironment.getExeutionEnvironment val st

Re: dataset dataframe join

2016-06-16 Thread Till Rohrmann
Hi Vishnu, currently the only way to do this, is to persist the DataSet (e.g. writing to a file) and then reading from the persisted form (e.g. file) in the open method of a rich function in the DataStream program. That way you can keep the data in your operator and then join with incoming stream

Re: Checkpoint takes long with FlinkKafkaConsumer

2016-06-16 Thread Hironori Ogibayashi
Ufuk, Yes, of course. I will be sure to update when I got some more information. Hironori 2016-06-16 1:56 GMT+09:00 Ufuk Celebi : > Hey Hironori, > > thanks for reporting this. Could you please update this thread when > you have more information from the Kafka list? > > – Ufuk > > > On Wed, Jun