Re: Error during Kafka connection

2017-08-13 Thread Tzu-Li (Gordon) Tai
Hi, I don’t have experience running Kafka clusters behind proxies, but it seems like the configuration options “advertised.host.name” and “advertised.port” for your Kafka brokers are what you’re looking for. For details on those, please refer to the Kafka documentation. Cheers, Gordon On 12 Augus
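
For reference, a minimal sketch of those broker-side settings in server.properties (the host and port values are placeholders):

    # server.properties (Kafka broker) -- values below are placeholders
    # The address the broker advertises to clients, e.g. the proxy endpoint
    advertised.host.name=proxy.example.com
    advertised.port=9092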

Re: Flink Data Streaming to S3

2017-08-13 Thread vinay patil
Hi, Yes, I am able to write to S3 using the DataStream API. I have posted the approach in an answer on SO. Regards, Vinay Patil On Mon, Aug 14, 2017 at 4:21 AM, ant burton [via Apache Flink User Mailing List archive.] wrote: > Hello, > > Has anybody been able to write to S3 when using the data streaming

Re: kerberos yarn - failure in long running streaming application

2017-08-13 Thread Tzu-Li (Gordon) Tai
Hi, At first glance it seems odd, since keytabs would not expire unless the principal’s password expires or changes. Was the principal’s password set to expire, or was it changed? The keytab would also expire in that case. Cheers, Gordon On 14 August 2017 at 2:15:40 PM, Prabhu V (vpra...@gmail.c
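
For context, keytab-based authentication in Flink is configured in flink-conf.yaml; a minimal sketch (the keytab path and principal are placeholders):

    # flink-conf.yaml -- values below are placeholders
    security.kerberos.login.use-ticket-cache: false
    security.kerberos.login.keytab: /path/to/flink.keytab
    security.kerberos.login.principal: flink-user@EXAMPLE.COM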

kerberos yarn - failure in long running streaming application

2017-08-13 Thread Prabhu V
Hi, I am running Flink 1.3.2 on YARN (Cloudera 2.6.0-cdh5.7.6). The application streams data from Kafka, groups by key, creates a session window, and writes to HDFS using a rich window function in the "window.apply" method. The rich window function creates the sequence file thus SequenceFile.creat
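
A minimal sketch of the shape of such a pipeline, with the record type, gap duration, and output logic assumed for illustration:

    // Hypothetical sketch: session window with a rich window function (Flink 1.3)
    import org.apache.flink.api.java.tuple.Tuple;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.windowing.RichWindowFunction;
    import org.apache.flink.streaming.api.windowing.assigners.ProcessingTimeSessionWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    DataStream<Tuple2<String, String>> events = ...; // records consumed from Kafka
    events.keyBy(0)
          .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
          .apply(new RichWindowFunction<Tuple2<String, String>, String, Tuple, TimeWindow>() {
              @Override
              public void apply(Tuple key, TimeWindow window,
                                Iterable<Tuple2<String, String>> values,
                                Collector<String> out) throws Exception {
                  // open()/close() of the rich function could manage the HDFS SequenceFile writer
                  for (Tuple2<String, String> v : values) {
                      out.collect(v.f1);
                  }
              }
          });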

Question about Global Windows.

2017-08-13 Thread Steve Jerman
Hi Folks, I have a question regarding global windows. I have a stream with a large number of records. The records have a key with very high cardinality. They also have a state (start, status, finish). I need to do some processing where I look at the records separated into windows using
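
A minimal sketch of the usual pattern for this, assuming a hypothetical Record type with a state field: a global window per key, with a custom trigger that fires and purges when the terminal state arrives:

    // Hypothetical sketch: fire-and-purge a per-key global window on a "finish" record
    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;

    public class FinishTrigger extends Trigger<Record, GlobalWindow> {
        @Override
        public TriggerResult onElement(Record r, long timestamp, GlobalWindow w,
                                       TriggerContext ctx) throws Exception {
            // Emit and discard the window contents once the record reaches its final state
            return "finish".equals(r.state) ? TriggerResult.FIRE_AND_PURGE
                                            : TriggerResult.CONTINUE;
        }
        @Override
        public TriggerResult onProcessingTime(long time, GlobalWindow w, TriggerContext ctx) {
            return TriggerResult.CONTINUE;
        }
        @Override
        public TriggerResult onEventTime(long time, GlobalWindow w, TriggerContext ctx) {
            return TriggerResult.CONTINUE;
        }
        @Override
        public void clear(GlobalWindow w, TriggerContext ctx) throws Exception { }
    }

    // usage: stream.keyBy(r -> r.key).window(GlobalWindows.create()).trigger(new FinishTrigger())...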

Distribute crawling of a URL list using Flink

2017-08-13 Thread Eranga Heshan
Hi all, I am fairly new to Flink. I have this project where I have a list of URLs (on one node) which need to be crawled in a distributed fashion. Then, for each URL, I need the serialized crawl result to be written to a single text file. I want to know if there are similar projects which I can look into o
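
A minimal sketch of one way to structure this (the paths and the fetch helper are hypothetical):

    // Hypothetical sketch: parallel crawl of a URL list, single-file output
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<String> urls = env.readTextFile("hdfs:///tmp/urls.txt"); // one URL per line
    urls.rebalance() // spread the URLs evenly across all parallel workers
        .map(new MapFunction<String, String>() {
            @Override
            public String map(String url) throws Exception {
                return fetchAndSerialize(url); // hypothetical crawling helper
            }
        })
        .writeAsText("hdfs:///tmp/crawl-output")
        .setParallelism(1); // parallelism 1 on the sink yields a single text file
    env.execute("distributed crawl");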

Re: IllegalArgumentException when using elasticsearch as a sink

2017-08-13 Thread mingleizhang
BTW, my Elasticsearch version is 2.3.3, not the 1.7.1 referenced in the JIRA FLINK-7133. And I found that 2.3.3 is not based on asm. My Flink version is 1.3.1; flink-connector-elasticsearch-base_2.10 is version 1.3.1 and flink-connector-elasticsearch2_2.10 is version 1.3.1 as well. At 2017-08-13 21:54:06, "minglei

Flink Data Streaming to S3

2017-08-13 Thread ant burton
Hello, Has anybody been able to write to S3 when using the data streaming API? I’m having this problem: https://stackoverflow.com/questions/45655850/flink-s3-write-fails-unable-to-load-aws-credentials-from-any-provider-in-the-cha
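
For what it’s worth, a minimal sketch of an S3 write with the DataStream API of that era, via BucketingSink (the bucket path is a placeholder; this assumes the Hadoop S3 filesystem and AWS credentials are already configured on the cluster):

    // Hypothetical sketch: streaming writes to S3 with BucketingSink (Flink 1.3)
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

    DataStream<String> events = ...; // some upstream stream
    events.addSink(new BucketingSink<String>("s3://my-bucket/flink-output"));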

Re: Aggregation by key hierarchy

2017-08-13 Thread Basanth Gowda
For example, this is a sample model from one of the Apache Apex presentations. I would want to aggregate over different combinations and different time buckets. What is the best way to do this in Flink? {"keys":[{"name":"campaignId","type":"integer"}, {"name":"adId","type":"integer"}, {"name":

Aggregation by key hierarchy

2017-08-13 Thread Basanth Gowda
Hi, I want to aggregate hits by Country, State, and City; I would have these as tags in my sample data. How would I do aggregation at different levels? The input would be a single record. Should I do a flatMap transformation first and create 3 records from 1 input record, or is there a better way to do it?
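
One common shape for the flatMap approach (the types, field names, and window size below are illustrative):

    // Hypothetical sketch: fan each hit out to country/state/city keys, then count per key
    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.util.Collector;

    DataStream<Hit> hits = ...; // Hit with country/state/city fields (hypothetical type)
    hits.flatMap(new FlatMapFunction<Hit, Tuple2<String, Long>>() {
            @Override
            public void flatMap(Hit h, Collector<Tuple2<String, Long>> out) {
                out.collect(new Tuple2<>("country/" + h.country, 1L));
                out.collect(new Tuple2<>("state/" + h.country + "/" + h.state, 1L));
                out.collect(new Tuple2<>("city/" + h.country + "/" + h.state + "/" + h.city, 1L));
            }
        })
        .keyBy(0)
        .timeWindow(Time.minutes(1))
        .sum(1); // hit count per hierarchy level, per time bucket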

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread AndreaKinn
Thank you, I solved it this way. Greg Hogan wrote > You should be able to implement this using a TypeHint (see the Creating a > TypeInformation or TypeSerializer section from the linked page): > > return TypeInformation.of(new TypeHint Date, > String, String, Doubl

Re: Writing on Cassandra

2017-08-13 Thread AndreaKinn
Ok, this is my situation: I have a stream of Tuple2<…> and the Cassandra code: CassandraSink.addSink(stream) .setQuery("INSERT INTO keyspace_local.values_by_sensors_users" + " (user, sensor, timestamp, json_ld, observed_value, value)" + " VALUES (?, ?,
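
A hedged sketch of how that snippet usually completes (the contact point is a placeholder; note that the number of ? placeholders must match the arity of the tuples in the stream):

    // Hypothetical completion of the CassandraSink snippet above
    import com.datastax.driver.core.Cluster;
    import org.apache.flink.streaming.connectors.cassandra.CassandraSink;
    import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

    CassandraSink.addSink(stream)
        .setQuery("INSERT INTO keyspace_local.values_by_sensors_users"
                + " (user, sensor, timestamp, json_ld, observed_value, value)"
                + " VALUES (?, ?, ?, ?, ?, ?);")
        .setClusterBuilder(new ClusterBuilder() {
            @Override
            protected Cluster buildCluster(Cluster.Builder builder) {
                return builder.addContactPoint("127.0.0.1").build(); // placeholder host
            }
        })
        .build();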

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread Greg Hogan
You should be able to implement this using a TypeHint (see the Creating a TypeInformation or TypeSerializer section from the linked page): return TypeInformation.of(new TypeHint<…>(){}); > On Aug 13, 2017, at 10:31 AM, AndreaKinn wrote: > > Hi, > I'm trying to implement a custom deseria
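
A minimal sketch of the pattern (the concrete tuple type here is illustrative):

    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.tuple.Tuple2;

    @Override
    public TypeInformation<Tuple2<String, Double>> getProducedType() {
        // The anonymous TypeHint subclass captures the full generic type at compile time
        return TypeInformation.of(new TypeHint<Tuple2<String, Double>>(){});
    }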

Re: PageRank iteration

2017-08-13 Thread Greg Hogan
PageRank uses a bulk iteration via DataSet#iterate, whereas a delta iteration would start with DataSet#iterateDelta. > On Aug 13, 2017, at 10:30 AM, Kaepke, Marc wrote: > > Hi everyone, > > does PageRank use bulk or delta iteration? > > I mean the implementation of PageRank of the package
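
For comparison, a minimal bulk-iteration sketch with DataSet#iterate (the computation is illustrative):

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.IterativeDataSet;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    IterativeDataSet<Double> initial = env.fromElements(0.0).iterate(10); // at most 10 supersteps
    DataSet<Double> step = initial.map(new MapFunction<Double, Double>() {
        @Override
        public Double map(Double v) { return v + 1.0; }
    });
    DataSet<Double> result = initial.closeWith(step); // feeds each superstep back into the next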

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread Ted Yu
Please take a look at the following in flink-java/src/main/java/org/apache/flink/api/java/hadoop/mapred/HadoopInputFormat.java: public TypeInformation<Tuple2<K, V>> getProducedType() { return new TupleTypeInfo<>(TypeExtractor.createTypeInfo(keyClass), TypeExtractor.createTypeInfo(valueClass)); FYI On S

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread AndreaKinn
But I'm using standard Java types like String and Double, plus Date. Doesn't Flink know how to handle them?

Re: TypeInformation in Custom Deserializer

2017-08-13 Thread Ted Yu
From ResultTypeQueryable: /** * Gets the data type (as a {@link TypeInformation}) produced by this function or input format. * * @return The data type produced by this function or input format. */ TypeInformation<T> getProducedType(); You can look at classes which implement this method a

TypeInformation in Custom Deserializer

2017-08-13 Thread AndreaKinn
Hi, I'm trying to implement a custom deserializer to deserialize data from a Kafka source. So I'm implementing a KeyedDeserializationSchema<…> which asks me to override the method: @Override public TypeInformation<…> getProducedType() { // to do } Honestly I investigated in link
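
A hypothetical skeleton of such a deserializer (the produced Tuple2 type is purely illustrative):

    import java.io.IOException;
    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.util.serialization.KeyedDeserializationSchema;

    public class MySchema implements KeyedDeserializationSchema<Tuple2<String, Double>> {
        @Override
        public Tuple2<String, Double> deserialize(byte[] messageKey, byte[] message,
                String topic, int partition, long offset) throws IOException {
            // Kafka keys may be null; guard before decoding
            String key = messageKey == null ? "" : new String(messageKey);
            return new Tuple2<>(key, Double.parseDouble(new String(message)));
        }

        @Override
        public boolean isEndOfStream(Tuple2<String, Double> next) {
            return false; // unbounded stream
        }

        @Override
        public TypeInformation<Tuple2<String, Double>> getProducedType() {
            return TypeInformation.of(new TypeHint<Tuple2<String, Double>>(){});
        }
    }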

PageRank iteration

2017-08-13 Thread Kaepke, Marc
Hi everyone, does PageRank use bulk or delta iteration? I mean the implementation of PageRank in the package org.apache.flink.graph.library.link_analysis. Thanks. Best, Marc

Re: Apache beam and Flink

2017-08-13 Thread Ted Yu
I found this: http://search-hadoop.com/m/Beam/gfKHFfzU1k2VKWlv?subj=Re+CEP+Pattern+matching+on+top+of+Beam+pipeline Consider asking on the Beam mailing list. On Sun, Aug 13, 2017 at 6:53 AM, Basanth Gowda wrote: > I wasn't able to find much info on Flink and Beam wrt CEP & Graph > functionalities.

IllegalArgumentException when using elasticsearch as a sink

2017-08-13 Thread mingleizhang
Hello, Flink experts and friends! It is my first time writing a Flink application at my company, but I hit the following error when I used Elasticsearch as my sink. I searched for a solution and found a JIRA: https://issues.apache.org/jira/browse/FLINK-7133. Then I added the PR to my co

Apache beam and Flink

2017-08-13 Thread Basanth Gowda
I wasn't able to find much info on Flink and Beam wrt CEP & Graph functionalities. Are these supported with Apache Beam? Is it possible to mix and match Flink and Beam in a single use case? Thank you, Basanth