Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

2019-06-14 Thread Fabian Hueske
Hi Visha, If I remember correctly, the behavior of the Kafka consumer was changed in Flink 1.8 to account for such situations. Please check the release notes [1] and the corresponding Jira issue [2]. If this is not the behavior you need, please feel free to create a new Jira issue and start a dis

How to join/group 2 streams by key?

2019-06-14 Thread John Tipper
Hi All, I have 2 streams of events that relate to a common base event, where one stream is the result of a flatmap. I want to join all events that share a common identifier. Thus I have something that looks like: DataStream streamA = ... DataStream streamB = someDataStream.flatMap(...) // pro

Flink sent/received bytes related metrics: how are they calculated?

2019-06-14 Thread Andrea Spina
Dear Community, I'd like to ask for some details about bytes related metrics in Flink. Precisely, I'm looking at *bytes sent* and *bytes received* metrics: what I am recording is the following: I am reading from a Kafka topic *A* records with schema *K* using a source *Sin* belonging to a pipeline

Error while using session window

2019-06-14 Thread Abhishek Jain
Hi, I have a job that uses processing time session window with inactivity gap of 60ms where I intermittently run into the following exception. I'm trying to figure out what happened here. Haven't been able to reproduce this scenario. Any thoughts? java.lang.UnsupportedOperationException: *The end

Re: Flink sent/received bytes related metrics: how are they calculated?

2019-06-14 Thread Chesnay Schepler
Which version of Flink are you using? There were some issues at some point about double-counting. On 14/06/2019 09:49, Andrea Spina wrote: Dear Community, I'd like to ask for some details about bytes related metrics in Flink. Precisely, I'm looking at *bytes sent* and *bytes received* metrics:

Re: Re: How can i just implement a crontab function using flink?

2019-06-14 Thread wangl...@geekplus.com.cn
I tried, but MyProcessWindowFunction is still not triggered when there's no event in the window. Any insight on this? source.assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks() { @Override public Watermark getCurrentWatermark() { return new Watermark(System.curren

How to trigger the window function even there's no message input in this window?

2019-06-14 Thread wangl...@geekplus.com.cn
windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10))).process(new MyProcessWindowFunction()); How can I trigger MyProcessWindowFunction even when there's no input during this window? wangl...@geekplus.com.cn

Re: Flink sent/received bytes related metrics: how are they calculated?

2019-06-14 Thread Andrea Spina
Sorry, I totally missed the version: flink-1.6.4, Streaming API On Fri, 14 Jun 2019 at 11:08, Chesnay Schepler < ches...@apache.org> wrote: > Which version of Flink are you using? There were some issues at some point > about double-counting. > > On 14/06/2019 09:49, Andrea Spina w

Re: [DISCUSS] Deprecate previous Python APIs

2019-06-14 Thread Stephan Ewen
Okay, so we seem to have consensus for at least deprecating them, with a suggestion to even directly remove them. A previous survey also brought no users of that Python API to light [1]. I am inclined to go with removing. Typically, deprecation is the way to go, but we could make an exception and e

Re: Flink sent/received bytes related metrics: how are they calculated?

2019-06-14 Thread Chesnay Schepler
What does the *P1* pipeline look like? Are there 2 downstream operators reading from *Sin* (in this case the number of bytes would be measured twice)? On 14/06/2019 12:09, Andrea Spina wrote: Sorry, I totally missed the version: flink-1.6.4, Streaming API On Fri, 14 Jun 2019 at 11:

Re: [DISCUSS] Deprecate previous Python APIs

2019-06-14 Thread Yu Li
+1 on removing plus an explicit NOTE thread, to prevent any neglect due to the current title (deprecation). Best Regards, Yu On Fri, 14 Jun 2019 at 18:09, Stephan Ewen wrote: > Okay, so we seem to have consensus for at least deprecating them, with a > suggestion to even directly remove them

Re: Flink sent/received bytes related metrics: how are they calculated?

2019-06-14 Thread Andrea Spina
Hi Chesnay, just one downstream: Sin (Source: Enriched Code) outcome is the right part of the following operator as in the figure; this operator is the exclusive downstream of Sin. Thanks, [image: Screenshot 2019-06-14 at 13.52.52.png] On Fri, 14 Jun 2019 at 12:23, Chesnay Schepler < c

Re: [DISCUSS] Deprecate previous Python APIs

2019-06-14 Thread jincheng sun
+1 for removing and we can try our best to enrich the new Python API. Cheers, Jincheng On Fri, 14 Jun 2019 at 6:42 PM, Yu Li wrote: > +1 on removing plus an explicit NOTE thread, to prevent any neglect due > to the current title (deprecation). > > Best Regards, > Yu > > > On Fri, 14 Jun 2019 at 18:09, Step

Re: [Flink 1.6.1] _metadata file in retained checkpoint

2019-06-14 Thread Rinat
Hi Vasyl, thanks for your reply, I’ll check. > On 10 Jun 2019, at 14:22, Vasyl Bervetskyi wrote: > > Hi Rinat, > > A savepoint needs to be triggered when you want to create a point in time that > you can use in the future to revert your state; you can also cancel > a job with a savepoint, which ma

privacy preserving on streaming Kmeans in Flink

2019-06-14 Thread alaa
I am doing research on privacy-preserving streaming k-means clustering and want to develop an implementation on Flink. My questions are: Is there a library for k-means in Flink? I didn't see one in FlinkML. How can I implement streaming k-means clustering on Flink? If

Re: How to restart/recover on reboot?

2019-06-14 Thread John Smith
I looked into the start-cluster.sh and I don't see anything special. So technically it should be as easy as installing systemd services to run jobmanager.sh and taskmanager.sh respectively? On Wed, 12 Jun 2019 at 13:02, John Smith wrote: > The installation instructions do not indicate how to cre

Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

2019-06-14 Thread Vishal Santoshi
Yep, but "Consider this example: if you had a Kafka Consumer that was consuming from topic A, you did a savepoint, then changed your Kafka consumer to instead consume from topic B, and then restarted your job from the savepoint. Before this change, your consumer would now consume from both topic A

Re: Deletion of topic from a pattern in KafkaConsumer if auto-create is true will always recreate the topic

2019-06-14 Thread Vishal Santoshi
I am sure there was a ticket open that allowed for clean manipulation of state (that would have saved us a whole lot). On Fri, Jun 14, 2019 at 1:19 PM Vishal Santoshi wrote: > Yep, but > > "Consider this example: if you had a Kafka Consumer that was consuming > from topic A, you did a sav

Apache Flink - Question about metric registry and reporter and context information

2019-06-14 Thread M Singh
Hi: I wanted to find out whether the metric reporter and registry are instantiated per task manager (which is a single JVM process) or per slot. I believe it is per task manager (JVM process) but just wanted to confirm. Also, is there a way to access context information (e.g., task manager name) in the m

Re: privacy preserving on streaming Kmeans in Flink

2019-06-14 Thread Ken Krugler
Hi Alaa, You could look at https://github.com/ScaleUnlimited/flink-streaming-kmeans for an example of this. Though note that there are non-iterative versions to k-means clustering that are much more efficient. This code was a way of e

Apache Flink Sql - How to access EXPR$0, EXPR$1 values from a Row in a table

2019-06-14 Thread M Singh
Hi: I am working with Flink SQL and have a table with the following schema: root |-- name: String |-- idx: Integer |-- pos: String |-- tx: Row(EXPR$0: Integer, EXPR$1: String) How can I access the attributes tx.EXPR$0 and tx.EXPR$1? I tried the following (the table is registered as 'tupleTable')