Re: External DB as sink - with processing guarantees

2016-03-11 Thread Nick Dimiduk
Pretty much anything you can write to from a Hadoop MapReduce program can be a Flink destination. Just plug in the OutputFormat and go. Re: output semantics, your mileage may vary; Flink should do you fine for at-least-once.
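A minimal sketch of that pattern, assuming a DataSet<Tuple2<Text, LongWritable>> named "result" and Flink's Hadoop compatibility wrapper (the output path is illustrative):

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    // Route a DataSet through a vanilla Hadoop OutputFormat.
    static void writeViaHadoop(DataSet<Tuple2<Text, LongWritable>> result) throws Exception {
        Job job = Job.getInstance();
        HadoopOutputFormat<Text, LongWritable> format =
            new HadoopOutputFormat<>(new TextOutputFormat<Text, LongWritable>(), job);
        TextOutputFormat.setOutputPath(job, new Path("hdfs:///tmp/flink-out")); // illustrative path
        result.output(format);
    }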

External DB as sink - with processing guarantees

2016-03-11 Thread Josh
Hi all, I want to use an external data store (DynamoDB) as a sink with Flink. It looks like there's no connector for Dynamo at the moment, so I have two questions: 1. Is it easy to write my own sink for Flink, and are there any docs on how to do this? 2. If I do this, will I still be able to…
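In case it helps, a minimal sketch of a custom sink, assuming the AWS SDK's AmazonDynamoDBClient and a hypothetical event type MyEvent; note this gives at-least-once semantics at best, since records between the last checkpoint and a failure can be replayed:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
    import com.amazonaws.services.dynamodbv2.model.AttributeValue;

    public class DynamoDbSink extends RichSinkFunction<MyEvent> { // MyEvent is hypothetical
        private transient AmazonDynamoDBClient client;

        @Override
        public void open(Configuration parameters) {
            client = new AmazonDynamoDBClient(); // default credential chain
        }

        @Override
        public void invoke(MyEvent event) {
            Map<String, AttributeValue> item = new HashMap<>();
            item.put("id", new AttributeValue(event.getId()));
            item.put("payload", new AttributeValue(event.getPayload()));
            client.putItem("my-table", item); // table name is illustrative
        }

        @Override
        public void close() {
            if (client != null) {
                client.shutdown();
            }
        }
    }

The stream is then wired up with stream.addSink(new DynamoDbSink()). Idempotent writes (e.g. keying items by a deterministic id) are what turn replays into effectively-once updates.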

Re: Running Flink 1.0.0 on YARN

2016-03-11 Thread Robert Metzger
Hi, the first issue you are describing is expected. Flink starts the web interface on the container running the JobManager, not on the resource manager. Also, the port is allocated dynamically to avoid port collisions, so it's not started on 8081. However, you can access the web interface…
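For finding the dynamically allocated address, a sketch assuming standard YARN tooling (the YARN session client also prints the URL on startup):

    # The JobManager web UI appears as the application's tracking URL:
    yarn application -list
    # the Tracking-URL column points at the JobManager container, not port 8081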

Re: Flink and YARN ship folder

2016-03-11 Thread Ufuk Celebi
Everything in the lib folder should be added to the classpath. Can you check in the YARN client logs that the files are uploaded? Furthermore, you can check the classpath of the JVM in the YARN logs of the JobManager/TaskManager processes. – Ufuk

Re: Log4j configuration on YARN

2016-03-11 Thread Ufuk Celebi
Hey Nick! I just checked, and the conf/log4j.properties file is copied and given as an argument to the JVM. You should see the following:
- the client logs that the conf/log4j.properties file is copied
- the JobManager logs show log4j.configuration being passed to the JVM
Can you confirm that these…

Log4j configuration on YARN

2016-03-11 Thread Nick Dimiduk
Can anyone tell me where I must place my application-specific log4j.properties to have them honored when running on a YARN cluster? Placing them in my application jar doesn't work, and neither do the log4j files under flink/conf. My goal is to set the log level for 'com.mycompany' classes used in my Flink app…
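For reference, a sketch of what conf/log4j.properties could look like for this goal, modeled on Flink's default file-appender configuration (the com.mycompany line is the application-specific part):

    log4j.rootLogger=INFO, file
    # application-specific log level
    log4j.logger.com.mycompany=DEBUG
    log4j.appender.file=org.apache.log4j.FileAppender
    log4j.appender.file.file=${log.file}
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n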

Passing two values to the ConvergenceCriterion function

2016-03-11 Thread Riccardo Diomedi
Hi, I want to send two values to the ConvergenceCriterion function, so I decided to use an aggregator of Tuple2. But when I implement Aggregator, I cannot do that because Tuple2 doesn't implement Value. So I tried to create a class Tuple2Value that implements Value, but here I get stuck…
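One way around this is a small custom Value with two plain fields instead of a Tuple2. A sketch, assuming both components are doubles:

    import java.io.IOException;
    import org.apache.flink.core.memory.DataInputView;
    import org.apache.flink.core.memory.DataOutputView;
    import org.apache.flink.types.Value;

    // A two-field Value; an Aggregator<DoublePairValue> can then carry
    // both numbers into the ConvergenceCriterion at once.
    public class DoublePairValue implements Value {
        private double first;
        private double second;

        public DoublePairValue() {} // Value types need a public no-arg constructor

        public DoublePairValue(double first, double second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public void write(DataOutputView out) throws IOException {
            out.writeDouble(first);
            out.writeDouble(second);
        }

        @Override
        public void read(DataInputView in) throws IOException {
            first = in.readDouble();
            second = in.readDouble();
        }

        // getters/setters omitted
    }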

Flink and YARN ship folder

2016-03-11 Thread Andrea Sella
Hi, is there a way to add external dependencies to a Flink job running on YARN without using HADOOP_CLASSPATH? I am looking for something similar to the lib folder in standalone mode. BR, Andrea
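One possibility, assuming the YARN client's ship option (-yt/--yarnship) is available in your version:

    # Ship a directory of extra jars to all YARN containers:
    ./bin/flink run -m yarn-cluster -yt /path/to/extra-libs ./my-job.jar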

Kafka integration error

2016-03-11 Thread Stefanos Antaris
Hi to all, I am trying to make Flink work with Kafka, but I always get the following exception. It works perfectly on my laptop, but it always fails when I try to use it on the cluster. java.lang.Exception at org.apache.flink.streaming.connectors.kafka.internals.LegacyFetcher.run(Lega…

Re: Stack overflow from self referencing Avro schema

2016-03-11 Thread David Kim
Thanks Stephan! :) On Thu, Mar 10, 2016 at 11:06 AM, Stephan Ewen wrote: > The following issue should track that: https://issues.apache.org/jira/browse/FLINK-3602 > @Niels: Thanks for looking into this. At this point, I think it may actually be a Flink issue, since it concerns the interact…

Re: kafka.javaapi.consumer.SimpleConsumer class not found

2016-03-11 Thread Balaji Rajagopalan
Robert, that did not fix it (using Flink and the connector at the same version). Tried with Scala version 2.11, so I will try Scala 2.10 to see if it makes any difference. balaji

Re: protobuf messages from Kafka to elasticsearch using flink

2016-03-11 Thread Robert Metzger
Hi, I think what you have to do is the following: 1. Create your own DeserializationSchema; there, the deserialize() method gets a byte[] for each message in Kafka. 2. Deserialize the byte[] using the generated classes from protobuf. 3. If your datatype is called "Foo", there should be a generated…
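A sketch of such a schema, assuming a protobuf-generated class Foo with the standard parseFrom(byte[]) factory:

    import java.io.IOException;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.typeutils.TypeExtractor;
    import org.apache.flink.streaming.util.serialization.DeserializationSchema;

    public class FooDeserializationSchema implements DeserializationSchema<Foo> {

        @Override
        public Foo deserialize(byte[] message) throws IOException {
            return Foo.parseFrom(message); // generated by the protobuf compiler
        }

        @Override
        public boolean isEndOfStream(Foo nextElement) {
            return false; // keep consuming
        }

        @Override
        public TypeInformation<Foo> getProducedType() {
            return TypeExtractor.getForClass(Foo.class);
        }
    }

The schema is then passed to the Kafka consumer constructor in place of a SimpleStringSchema.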

Re: kafka.javaapi.consumer.SimpleConsumer class not found

2016-03-11 Thread Robert Metzger
Hi, you have to use the same version for all dependencies from the "org.apache.flink" group. You said these are the versions you are using: flink.version = 0.10.2, kafka.version = 0.8.2, flink.kafka.connector.version = 0.9.1. For the connector, you also need to use 0.10.2.
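In Maven terms, the idea is to pin every org.apache.flink artifact to one property (artifact IDs are illustrative; names and Scala suffixes vary across Flink versions):

    <properties>
      <flink.version>0.10.2</flink.version>
    </properties>
    <dependencies>
      <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java</artifactId>
        <version>${flink.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka</artifactId>
        <version>${flink.version}</version>
      </dependency>
    </dependencies>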

Re: Flink streaming throughput

2016-03-11 Thread Robert Metzger
Hi Hironori, can you try with the kafka-console-consumer how many messages you can read in one minute? Maybe the broker's disk I/O is limited because everything is running in virtual machines (potentially sharing one hard disk?). I'm also not sure if running a Kafka 0.8 consumer against a 0.9 broker…
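A sketch of that measurement, using the console consumer that ships with Kafka (0.8-style ZooKeeper addressing; host and topic are illustrative):

    bin/kafka-console-consumer.sh --zookeeper zk-host:2181 --topic my-topic > /dev/null
    # run for one minute, stop it, and compare the consumed count with the Flink job's rate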

Re: 404 error for Flink examples

2016-03-11 Thread janardhan shetty
Good idea, Ufuk. But for a newbie user, a web page with explanations always helps more than a GitHub link.

Re: DataSet -> DataStream

2016-03-11 Thread Stephan Ewen
Hi! It should be quite straightforward to write an "OutputFormat" that wraps the "FlinkKafkaProducer". That way you can write to Kafka from a DataSet program. Stephan
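A sketch of that idea; rather than wrapping the FlinkKafkaProducer class itself, this variant uses the plain Kafka producer client inside an OutputFormat (broker list and topic are illustrative):

    import java.io.IOException;
    import java.util.Properties;
    import org.apache.flink.api.common.io.OutputFormat;
    import org.apache.flink.configuration.Configuration;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class KafkaOutputFormat implements OutputFormat<String> {
        private final String brokers;
        private final String topic;
        private transient KafkaProducer<String, String> producer;

        public KafkaOutputFormat(String brokers, String topic) {
            this.brokers = brokers;
            this.topic = topic;
        }

        @Override
        public void configure(Configuration parameters) {}

        @Override
        public void open(int taskNumber, int numTasks) {
            Properties props = new Properties();
            props.put("bootstrap.servers", brokers);
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            producer = new KafkaProducer<>(props);
        }

        @Override
        public void writeRecord(String record) throws IOException {
            producer.send(new ProducerRecord<String, String>(topic, record));
        }

        @Override
        public void close() {
            if (producer != null) {
                producer.close(); // flushes buffered records
            }
        }
    }

A DataSet<String> then writes with result.output(new KafkaOutputFormat("broker:9092", "my-topic")).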

Re: DataSet -> DataStream

2016-03-11 Thread Prez Cannady
This is roughly the solution I have now. On the other hand, I was hoping for a solution that doesn't involve checking whether a file has been updated. Prez Cannady p: 617 500 3378 e: revp...@opencorrelate.org GH: https://github.com/opencorrelate

Re: 404 error for Flink examples

2016-03-11 Thread Ufuk Celebi
I was wondering whether we should completely remove that page and just link to the examples package on GitHub. Do you think that it would be a good idea?

Re: 404 error for Flink examples

2016-03-11 Thread Maximilian Michels
Thanks for noticing, Janardhan. Fixed for the next release. On Fri, Mar 11, 2016 at 6:38 AM, janardhan shetty wrote: > Thanks Balaji. This needs to be updated in the Job.java file of the quickstart application. I am using the 1.0 version…

kafka.javaapi.consumer.SimpleConsumer class not found

2016-03-11 Thread Balaji Rajagopalan
I am trying to use the Flink Kafka connector; for this I have specified the Kafka connector dependency and created a fat jar, since the default Flink installation does not contain the Kafka connector jars. I have made sure that flink-streaming-demo-0.1.jar has the kafka.javaapi.consumer.SimpleConsumer.class…

Re: Flink streaming throughput

2016-03-11 Thread Hironori Ogibayashi
Aljoscha, thank you for your response. I tried the no-JSON-parsing and no-sink (DiscardingSink) case. The throughput was 8,228 msg/sec, slightly better than the JSON + Elasticsearch case. I also tried using socketTextStream instead of FlinkKafkaConsumer; in that case, the result was 60,000 msg/sec with just…
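For reference, the no-sink variant presumably boils down to something like this (a sketch assuming a DataStream<String> named stream):

    import org.apache.flink.streaming.api.functions.sink.DiscardingSink;

    // Terminates the pipeline without doing any per-record work,
    // isolating source + deserialization throughput.
    stream.addSink(new DiscardingSink<String>());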