Qustion about Flink Upsert Dynamic Kafka Table unlimited expansion

2021-05-06 Thread vtygoss
Hi Community, Recently i am working on building realtime data warehouse at medical field. Using Flink and Upsert-Kafka Dynamic Table, but the historical data must not be expired and the changelog stream in kafka is unlimited expanding, i have met a problem with unlimited expanding data scale.

Re: Question regarding cpu limit config in Flink standalone mode

2021-05-06 Thread Xintong Song
Hi Fan, For a java application, you cannot specify how many cpu a process should use. The JVM process will always try to use as much cpu time as it needs. The limitation can only come from external: hardware limit, OS scheduling, cgroups, etc. On Kubernetes, it is the pod's resource specification

Question regarding cpu limit config in Flink standalone mode

2021-05-06 Thread Fan Xie
Hi Flink Community, Recently I am working on an auto-scaling project that needs to dynamically adjust the cpu config of Flink standalone jobs . Our jobs will be running on standalone mode in a k8s cluster. After going through the configuration doc: https://ci.apache.org/projects/flink/flink-doc

Re: Table name for table created fromDataStream

2021-05-06 Thread Leonard Xu
Hi, tbud You can register the Table API object as a temporary view and then run query on it: tableEnv.createTemporaryView(“MyTable”, eventsTable); tableEnv.executeSql(“SELECT * FROM MyTable“).print(); Best, Leonard > 在 2021年5月7日,03:17,tbud 写道: > > Does anybody know how to set the name for th

The problem of getting data by Rest API

2021-05-06 Thread penguin.
On the Web UI page, we can see that the relevant data is updated every 3S, such as the read-bytes of each operator. But when I get data through Rest API, the data is updated every 6 seconds or even more than 10 seconds. Why? The related data of read bytes obtained through Rest API is as follows:

Table name for table created fromDataStream

2021-05-06 Thread tbud
Does anybody know how to set the name for the table created using fromDataStream() method ? Flink's documentation doesn't mention anything about this and when I went through the taskManager logs I saw some auto generated name like 'Unregistered_DataStream_5'. Here's my code : /StreamTableEnvironmen

Re: Session Windows - not working as expected

2021-05-06 Thread Swagat Mishra
I am able to maintain a list state in process function and aggregate the values, how do i get a notification/event to remove the value from the stored list state. On Thu, May 6, 2021 at 8:47 PM Swagat Mishra wrote: > I meant "Do you recommend the state to be maintained in* Value** State *or > ex

Re: Flink auto-scaling feature and documentation suggestions

2021-05-06 Thread vishalovercome
Thank you for answering all my questions. My suggestion would be to start off with exposing an API to allow dynamically changing operator parallelism as the users of flink will be better able to decide the right scaling policy. Once this functionality is there, its just a matter of providing polici

Re: Unsubscribe

2021-05-06 Thread Dan Pettersson
I've also tried a few times now the last couple of months. I think it would be very nice if the "flink admin" could look into this, instead of us reaching out to the Apache Infrastructure team. Thanks, /Dan Den tors 6 maj 2021 kl 13:31 skrev Chesnay Schepler : > Could you reach out to the Apach

Re: Flink auto-scaling feature and documentation suggestions

2021-05-06 Thread vishalovercome
We do exactly what you mentioned. However, it's not that simple unfortunately. Our services don't have a predictable performance as traffic varies a lot during the day. As I've explained above increase source parallelism to 2 was enough to tip over our services and reducing parallelism of the asy

Re: Flink auto-scaling feature and documentation suggestions

2021-05-06 Thread vishalovercome
I am using the async IO operator. The problem is that increasing source parallelism from 1 to 2 was enough to tip our systems over the edge. Reducing the parallelism of async IO operator to 2 is not an option as that would reduce the throughput quite a bit. This means that no matter what we do, we'

Re: Session Windows - not working as expected

2021-05-06 Thread Swagat Mishra
I meant "Do you recommend the state to be maintained in* Value** State *or external store like elastic?" On Thu, May 6, 2021 at 8:46 PM Swagat Mishra wrote: > I want to aggregate the user activity e.g number of products the user has > purchased in the last 1 hour. > > so - User A (ID = USER-A)

Re: Session Windows - not working as expected

2021-05-06 Thread Swagat Mishra
I want to aggregate the user activity e.g number of products the user has purchased in the last 1 hour. so - User A (ID = USER-A) purchases a1 product at 10:30 and another product at 10:45 AM and another product at 1:30 AM. My API should give 2 products purchased if the API call happens at 11:29

Re: Flink auto-scaling feature and documentation suggestions

2021-05-06 Thread Till Rohrmann
Yes, exposing an API to adjust the parallelism of individual operators is definitely a good step towards the auto-scaling feature which we will consider. The missing piece is persisting this information so that in case of recovery you don't recover with a completely different parallelism. I also a

Re: Interacting with flink-jobmanager via CLI in separate pod

2021-05-06 Thread Robert Cullen
I resolved this by changing the jobmanager-rest-service.yaml (Changed type to ClusterIP and removed nodePort apiVersion: v1 kind: Service metadata: name: flink-jobmanager-rest spec: type: ClusterIP ports: - name: rest port: 8081 targetPort: 8081 #nodePort: 30081 selector:

Re: Session Windows - not working as expected

2021-05-06 Thread Arvid Heise
I'm not sure what you want to achieve exactly. You can always keyby the values by a constant pseudo-key such that all values will be in the same partition (so instead of using global but with the same effect). Then you can use a process function to maintain the state. Just make sure that your data

Re: Unsubscribe

2021-05-06 Thread Chesnay Schepler
Could you reach out to the Apache Infrastructure team about not being able to unsubscribe? Maybe this functionality is currently broken. On 5/6/2021 12:48 PM, Andrew Kramer wrote: I have been unable to unsubscribe as well. Have tried emailing just

Re: Unsubscribe

2021-05-06 Thread Andrew Kramer
I have been unable to unsubscribe as well. Have tried emailing just like you On Thu, May 6, 2021 at 3:33 AM Xander Song wrote: > How can I unsubscribe from the Apache Flink user mailing list? I have > tried emailing user-unsubscr...@flink.apache.org, but am still receiving > messages. > > Thank

callback by using process function

2021-05-06 Thread Abdullah bin Omar
Hi, According to [1] example section, (i) If we check the stored count of the last modification time against the previous timestamp count, then emit the count if they (count from last modification time) match with the previous timestamp count. Is there refere about checking the previous count? a

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-06 Thread Rui Li
Thanks to Dawid and Guowei for the great work! On Thu, May 6, 2021 at 4:48 PM Zhu Zhu wrote: > Thanks Dawid and Guowei for being the release managers! And thanks > everyone who has made this release possible! > > Thanks, > Zhu > > Yun Tang 于2021年5月6日周四 下午2:30写道: > >> Thanks for Dawid and Guowei

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-06 Thread Zhu Zhu
Thanks Dawid and Guowei for being the release managers! And thanks everyone who has made this release possible! Thanks, Zhu Yun Tang 于2021年5月6日周四 下午2:30写道: > Thanks for Dawid and Guowei's great work, and thanks for everyone involved > for this release. > > Best > Yun Tang >

Re: Flink auto-scaling feature and documentation suggestions

2021-05-06 Thread Till Rohrmann
Hi Vishal, thanks a lot for all your feedback on the new reactive mode. I'll try to answer your questions. 0. In order to avoid confusion let me quickly explain a bit of terminology: The reactive mode is the new feature that allows Flink to react to newly available resources and to make use of th

Re: Session Windows - not working as expected

2021-05-06 Thread Swagat Mishra
thank you i wil have a look at datasteeam.global is there any other way to maintain state like by using valuestate. On Thu, 6 May 2021 at 1:26 PM, Arvid Heise wrote: > If you keyby then all direct functions see only the elements with the same > key. So that's the expected behavior and the bas

Re: Session Windows - not working as expected

2021-05-06 Thread Arvid Heise
If you keyby then all direct functions see only the elements with the same key. So that's the expected behavior and the base of Flink's parallel processing capabilities. If you want to generate a window over all customers, you have to use a global window. However, that also means that no paralleli

Unsubscribe

2021-05-06 Thread Xander Song
How can I unsubscribe from the Apache Flink user mailing list? I have tried emailing user-unsubscr...@flink.apache.org, but am still receiving messages. Thank you.

No result shown when submitting the SQL in cli

2021-05-06 Thread tao xiao
Hi team, I wrote a simple SQL job to select data from Kafka. I can see results printing out in IDE but when I submit the job to a standalone cluster in CLI there is no result shown. I am sure the job is running well in the cluster with debug log suggesting that the kafka consumer is fetching data

some questions about data skew

2021-05-06 Thread jester jim
Hi, I have run a program to monitor the sum of the delay in every minutes of a stream,this is my code: .map(new RichMapFunction[String,(Long,Int)] { override def map(in: String): (Long,Int) = { var str:String = "" try { val arr = in.split("\\|") ((System.currentTime

Re: Session Windows - not working as expected

2021-05-06 Thread Swagat Mishra
Thank you. sourceContext.collectWithTimestamp(c, c.getEventTime()); Adding this to the source context worked. However I am still getting only one customer in the process method. i would expect the iterable to provide all customers in the window. or do i have to maintain state. changes for refe

Re: Session Windows - not working as expected

2021-05-06 Thread Arvid Heise
Your source is not setting the timestamp with collectWithTimestamp. I'm assuming that nothing really moves from the event time's perspective. On Thu, May 6, 2021 at 8:58 AM Swagat Mishra wrote: > Yes customer generator is setting the event timestamp correctly like I see > below. I debugged and f