Re: Service discovery for flink-metrics-prometheus

2018-01-05 Thread Kien Truong
Hi, We are using YARN for deployment, so the combination of host and port for the Prometheus reporters can be really random depending on how the containers are co-located. One option we thought of was scraping the log for this information, but it can be really messy in the long run. Regards, Ki

Dynamically get schema from element to pass to AvroParquetWriter

2018-01-05 Thread Kyle Hamlin
I implemented an Avro to Parquet writer which previously took an Avro schema in as a string to the constructor and passed it to the AvroParquetWriter. Now I'm wondering if there is a way to get the schema from the element and pass to the AvroParquetWriter. I tried grabbing the schema from the eleme

Re: Can't call getProducedType on Avro messages with array types

2018-01-05 Thread Aljoscha Krettek
Yes, there is some magic in the KryoSerializer and other serialisers that detect whether the flink-avro dependency is there and then use special TypeSerializers from there. (Specifically, this is AvroUtils which has a default implementation that doesn't do much and a special implementation call

Re: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0

2018-01-05 Thread Kyle Hamlin
Also, I'm not using HDFS; I'm trying to sink to S3. On Fri, Jan 5, 2018 at 6:18 PM Kyle Hamlin wrote: > I have the hadoop-common.jar in my build.sbt because I was having issues > compiling my jar after moving from 1.3.2 to 1.4.0 because > org.apache.hadoop.fs.{FileSystem, Path} were no longer in

Re: Queryable State in Flink 1.4

2018-01-05 Thread Boris Lublinsky
Thanks, this was it. It would help to have this in the documentation along with `flink-queryable-state-client` Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > On Jan 5, 2018, at 11:46 AM, Till Rohrmann wrote: > > Did you add the `flink-queryable-state-runti

Re: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0

2018-01-05 Thread Kyle Hamlin
I have the hadoop-common.jar in my build.sbt because I was having issues compiling my jar after moving from 1.3.2 to 1.4.0: org.apache.hadoop.fs.{FileSystem, Path} were no longer in Flink, and I use them in my custom bucketer and in the writer that writes Avro out to Parquet. I tried adding classlo

Re: Can't call getProducedType on Avro messages with array types

2018-01-05 Thread Kyle Hamlin
So I just added the dependency but didn't change the getProducedType method and it worked fine. Would you expect that to be the case? On Fri, Jan 5, 2018 at 5:43 PM Aljoscha Krettek wrote: > Yes, that should do the trick. > > > On 5. Jan 2018, at 18:37, Kyle Hamlin wrote: > > I can add that dep

Re: "keyed" aggregation

2018-01-05 Thread Till Rohrmann
Hi Christophe, if you don't have a way to recompute the key from the aggregation result, then you have to write an aggregation function which explicitly keeps it (e.g. a tuple value where the first entry is the key and the second the aggregate value). Cheers, Till On Fri, Jan 5, 2018 at 5:51 PM,
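The approach described in this reply (an aggregate whose result is a tuple of key and aggregated value) can be sketched in plain Java. This is only an illustration without any Flink dependency: the class name `KeyPreservingSum`, the use of `Map.Entry` as the tuple type, and the choice of a sum as the aggregate are all assumptions made for the example, not part of the original thread.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: every partial result is a (key, sum) pair, so the key
// is still available after the aggregation and can be reused by the sink.
final class KeyPreservingSum {

    // Combine two partial results that belong to the same key.
    static Map.Entry<String, Long> combine(Map.Entry<String, Long> left,
                                           Map.Entry<String, Long> right) {
        if (!left.getKey().equals(right.getKey())) {
            throw new IllegalArgumentException("partial results must share a key");
        }
        // First entry: the key. Second entry: the aggregated value.
        return new SimpleImmutableEntry<>(left.getKey(), left.getValue() + right.getValue());
    }

    // Fold a list of (key, value) records into one (key, sum) pair per key.
    static Map<String, Map.Entry<String, Long>> aggregate(List<Map.Entry<String, Long>> records) {
        Map<String, Map.Entry<String, Long>> perKey = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : records) {
            perKey.merge(record.getKey(), record, KeyPreservingSum::combine);
        }
        return perKey;
    }
}
```

In a Flink job the same idea would apply to the reduce or aggregate function on the keyed stream: the function emits the pair rather than the bare value, and the Kafka sink reads the key from the first field.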

Re: Queryable State in Flink 1.4

2018-01-05 Thread Till Rohrmann
Did you add the `flink-queryable-state-runtime` jar as a dependency to your project? You can check the log whether a queryable state proxy and server have been started. Cheers, Till On Fri, Jan 5, 2018 at 5:19 PM, Boris Lublinsky < boris.lublin...@lightbend.com> wrote: > I also tried to comment

Re: Can't call getProducedType on Avro messages with array types

2018-01-05 Thread Aljoscha Krettek
Yes, that should do the trick. > On 5. Jan 2018, at 18:37, Kyle Hamlin wrote: > > I can add that dependency. So I would replace > > override def getProducedType: TypeInformation[T] = { > TypeExtractor.getForClass(implicitly[ClassTag[T]].runtimeClass.asInstanceOf[Class[T]]) > } > > with somethi

Re: Can't call getProducedType on Avro messages with array types

2018-01-05 Thread Kyle Hamlin
I can add that dependency. So I would replace override def getProducedType: TypeInformation[T] = { TypeExtractor.getForClass(implicitly[ClassTag[T]].runtimeClass.asInstanceOf[Class[T]]) } with something like: override def getProducedType: TypeInformation[T] = { new AvroTypeInfo(classOf[T]) }

Re: A question about Triggers

2018-01-05 Thread Fabian Hueske
Hi, you would not need the ListStateDescriptor. A ProcessWindowFunction stores all events that are assigned to a window (IN objects in your case) in an internal ListState. The Iterable parameter of the process() method iterates over the internal list state. So you would have a Trigger that fires

"keyed" aggregation

2018-01-05 Thread Christophe Jolif
Hi all, I'm sourcing from a Kafka topic, using the key of the Kafka message to key the stream, then doing some aggregation on the keyed stream. Now I want to sink back to a different Kafka topic but re-using the same key. The thing is that my aggregation "lost" the key. Obviously I can make sure

Re: A question about Triggers

2018-01-05 Thread Vishal Santoshi
Hello Fabian, Thank you for your response. I thought about it and maybe I am missing something obvious here. The code below is what I think you suggest. The issue is that the window now is a list of Sessions (or, shall we say, subsets of the Session). What is required is that on a ne

Re: Queryable State in Flink 1.4

2018-01-05 Thread Boris Lublinsky
I also tried to comment out //config.setInteger(ConfigConstants.LOCAL_NUMBER_TASK_MANAGER, 2); Still no luck. Do you guys have a working example for queryable state for 1.4 somewhere? Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > On Jan 5, 2018, at

Re: about the checkpoint and state backend

2018-01-05 Thread Aljoscha Krettek
If checkpointing is disabled RocksDB will only store state locally on disk but it will not be checkpointed to a DFS. This means that in case of failure you lose state. > On 5. Jan 2018, at 14:38, Jinhua Luo wrote: > > Thanks, and I will read the codes to get more understanding. > > Let me rep

Re: Queryable State in Flink 1.4

2018-01-05 Thread Boris Lublinsky
Thanks Till, I am probably slow. I changed the code to the following: // In a non MiniCluster setup queryable state is enabled by default. config.setString(QueryableStateOptions.PROXY_PORT_RANGE, "50100-50101") config.setInteger(QueryableStateOptions.PROXY_NETWORK_THREADS, 2) config.setInteger(Que

Re: about the checkpoint and state backend

2018-01-05 Thread Jinhua Luo
Thanks, and I will read the code to get a better understanding. Let me repeat another question: what happens if checkpointing is disabled (the default, as far as I know)? Is the state still saved? 2018-01-04 22:48 GMT+08:00 Aljoscha Krettek : > TaskManagers don't do any checkpointing but Operators that run

Re: Flink and Rest API

2018-01-05 Thread Till Rohrmann
Hi Alberto, currently, the queryable state is not exposed via a REST interface. You have to use the QueryableStateClient for that [1]. If it's not possible to directly connect to the machines running your Flink cluster, then you can also expose the values via the metric system as Sendoh proposed.

Re: A question about Triggers

2018-01-05 Thread Fabian Hueske
Hi Vishal, thanks for sharing your solution! Looking at this issue again and your mail in which you shared your SessionProcessWindow ProcessWindowFunction, I'm wondering why you need the ValueState that prevents the ProcessWindowFunction from being used in a mergeable window. You could have created a

Re: Queryable State in Flink 1.4

2018-01-05 Thread Till Rohrmann
Hi Boris, if you start 2 TaskManagers on the same host, then you have to define a port range for the KvState server and the proxy. Otherwise the Flink cluster should not be able to start. Cheers, Till On Thu, Jan 4, 2018 at 11:19 PM, Boris Lublinsky < boris.lublin...@lightbend.com> wrote: > It

Re: Parallelizing a tumbling group window

2018-01-05 Thread Fabian Hueske
Hi Colin, There are two things that come to my mind: 1) You mentioned "suspect jobs are grouping by a field of constant values". Does that mean that the grouping key is always constant? Flink parallelizes the window computation per key, i.e., there is one thread per key. Although it would be poss
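The first point in this reply (work is distributed per key, so a constant grouping key leaves no room for parallelism) can be illustrated with a small Java sketch. It is a deliberate simplification: Flink maps keys to key groups and key groups to parallel subtasks, whereas the hypothetical `subtaskFor` below simply takes the key's hash modulo the parallelism. The class and method names are assumptions made for the example.

```java
import java.util.List;

// Hypothetical stand-in for key-based partitioning; not Flink's actual
// key-group assignment.
final class KeyPartitioning {

    // Pick the parallel subtask that would process all records with this key.
    static int subtaskFor(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Count how many records each subtask would receive for the given keys.
    static int[] recordsPerSubtask(List<String> keys, int parallelism) {
        int[] counts = new int[parallelism];
        for (String key : keys) {
            counts[subtaskFor(key, parallelism)]++;
        }
        return counts;
    }
}
```

With a constant key, every record hashes to the same subtask, so all but one entry of the returned array stay at zero regardless of the configured parallelism.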

Re: Flink and Rest API

2018-01-05 Thread Sendoh
I think the first requirement should be possible using an accumulator or a metric, shouldn't it? Cheers, Sendoh -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: JobManager not receiving resource offers from Mesos

2018-01-05 Thread Dongwon Kim
Hi Till, Currently I'm doing as you said for the purpose of testing. So that's not a big deal at this moment. But I hope it will be supported in Flink sooner or later as we're going to adopt Flink on a very large cluster in which GPU resources are very scarce. Anyway thank you for your attenti