Re: Using venv in Pyflink

2021-12-23 Thread Dian Fu
The installation happens in a separate process and so there should be no direct relationship between them. Could you try to take a look at whether the GC runs normally. Besides, you could also try to increase the heartbeat timeout. Regards, Dian On Fri, Dec 24, 2021 at 12:02 PM Paul Lam wrote:

Re: Using venv in Pyflink

2021-12-23 Thread Paul Lam
Hi Dian, Thanks lot for your pointer! I tried `-pyreq` and it worked as expected in general. But another problem arises when I tried to specified more dependencies in the requirements.txt. The taskmanagers are often lost(unreachable from jobmanager) during the process of pip install. I wond

Re: Flink On Native K8s hostAliases in Pod-template

2021-12-23 Thread Yang Wang
Hi, The pod template file when you submit a Flink application via "flink run-application ... -Dkubernetes.pod-template-file=/path/of/pod-template.yaml" is a *client-local* file. You do not need to bundle it into the docker image. Best, Yang 黄剑文 于2021年12月23日周四 23:00写道: > Flink version:1.13 > I

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
Hi Jing, Please try this way, Only create one sink for final output, write the window aggregate and topN in one query, write the result of topN into the final sink. Best, Jing Zhang Jing 于2021年12月24日周五 03:13写道: > Hi Jing Zhang, > > Thanks for the reply! My current implementation is like this:

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing
Hi Jing Zhang, Thanks for the reply! My current implementation is like this: tableEnv.executeSql( "CREATE TABLE ItemDesc (item_id STRING, channel_id STRING, window_end BIGINT, num_select BIGINT) WITH ('connector' = 'kafka', 'scan.startup.mode' = 'latest-offset')" ) tableEnv.executeSql("""

Re: Re: Operator state in New Source API

2021-12-23 Thread Yun Gao
Hi Krzysztof, Sorry there are indeed no document said that the operator state is only kept in memory, but based on the current implementation it is indeed the case. And I might also need to fix one point: the Split Enumerate should be executed in the JM side inside the OperatorCoordinator, and

Flink On Native K8s hostAliases in Pod-template

2021-12-23 Thread 黄剑文
Flink version:1.13 I want to define some hosts in Flink jobmanager and taskmanager. I consult Flink official documents and find a way that define hostAlias in Pod Template to solve it. https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-

Re: Avoiding Dynamic Classloading for User Code

2021-12-23 Thread David Morávek
Then I don't really know what else to suggest in this direction. This approach should work in general, if you have control over the class path and you make sure all the dependencies play well with each other, as the parent-first classloading doesn't provide you with any crutches as the child-first

Re: Avoiding Dynamic Classloading for User Code

2021-12-23 Thread Lior Liviev
We use hadoop in EMR 6.4 (if I'm not mistaken, emr has it's own version of hadoop so we don't define it) and we use flink 13.1 From: David Morávek Sent: Thursday, December 23, 2021 1:44 PM To: Lior Liviev Cc: user Subject: Re: Avoiding Dynamic Classloading for U

Re: Avoiding Dynamic Classloading for User Code

2021-12-23 Thread David Morávek
Please try to post the whole stacktrace the next time, but in general it sounds like you've conflicting avro versions on the classpath (actually I'd expect a version < 1.8 [1] in your user jar / maybe loaded from hadoop?). What Flink version are you using? Are you using Flink with Hadoop (which ver

Re: Avoiding Dynamic Classloading for User Code

2021-12-23 Thread Lior Liviev
I get this: Caused by: java.lang.NoSuchMethodError: org.apache.avro.Schema.getLogicalType()Lorg/apache/avro/LogicalType; And I'm using avro 1.10 From: David Morávek Sent: Thursday, December 23, 2021 12:37 PM To: Lior Liviev ; user Subject: Re: Avoiding Dynamic Cl

Re: Using venv in Pyflink

2021-12-23 Thread Dian Fu
Hi Paul, Currently, you need to build venv in an environment where you want to execute the PyFlink jobs. >> Also, I wonder if it’s possible for pyflink to optionally provide an automatically created venv for each pyflink job? Do you mean to create the venv during executing the job? If this is you

Re: Operator state in New Source API

2021-12-23 Thread Krzysztof Chmielewski
Thank you both, yes seems that the only option on a non keyed operate would be List State, my bad. Yun Gao, I'm wondering from where you get the information that " Flink only support in-memory operator state", can you point me to the documentation that says that? I cannot find any mention in the d

Re: Avoiding Dynamic Classloading for User Code

2021-12-23 Thread David Morávek
I guess I'd need more context to answer that. Have you checked the JM logs for more details? On Thu, Dec 23, 2021 at 9:01 AM Lior Liviev wrote: > Is there any reason I'm getting "Could not execute application" after I > put the Jar in /lib? > -- > *From:* David Moráve

Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

2021-12-23 Thread Martijn Visser
Hi, Based on the screenshot of your source data, all events have the same value for `datatime`. Is that indeed correct? Best regards, Martijn On Thu, 23 Dec 2021 at 10:16, cy wrote: > Hi, > > I execute the sql on 1.13.5 and the result is correct but no result show > on 1.14.0 and 1.14.2. > sq

Re: [ANNOUNCE] Apache Flink Stateful Functions 3.1.1 released

2021-12-23 Thread Till Rohrmann
Thanks a lot for being our release manager and swiftly addressing the log4j CVE Igal! Cheers, Till On Wed, Dec 22, 2021 at 5:41 PM Igal Shilman wrote: > The Apache Flink community is very happy to announce the release of Apache > Flink Stateful Functions (StateFun) 3.1.1. > > This is a bugfix r

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-23 Thread Till Rohrmann
If there are no users strongly objecting to dropping Hadoop support for < 2.8, then I am +1 for this since otherwise we won't gain a lot as Xintong said. Cheers, Till On Wed, Dec 22, 2021 at 10:33 AM David Morávek wrote: > Agreed, if we drop the CI for lower versions, there is actually no point

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
Hi Jing, In fact, I agree with you to use TopN [2] instead of Window TopN[1] by normalizing time into a unit with 5 minute, and add it to be one of partition keys. Please note two points when use TopN 1. the result is an update stream instead of append stream, which means the result sent might be r

Using venv in Pyflink

2021-12-23 Thread Paul Lam
Hi, The document says we could use `-pyarch` to upload python venv, but I found this is often problematic because users may not have the python binaries that fits the flink runtime environment. For example, a user may upload a venv for macOS within the project, but the Flink cluster is using

Re:Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

2021-12-23 Thread cy
Hi, I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. sql: SELECT window_start, window_end, COUNT(*) FROM TABLE( TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES) ) GROUP BY window_start, windo

Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

2021-12-23 Thread Yun Gao
Sorry I mean 16:00:05, but it should be similar. --Original Mail -- Sender:Yun Gao Send Date:Thu Dec 23 17:05:33 2021 Recipients:cy CC:'user@flink.apache.org' Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly Hi Caiyi, I think if

Re: Re:Re: Window Aggregation and Window Join ability not work properly

2021-12-23 Thread Yun Gao
Hi Caiyi, I think if the image shows all the records, after the change we should only have the watermark at 16:05, which is still not be able to trigger the window of 5 minutes? Best, Yun --Original Mail -- Sender:cy Send Date:Thu Dec 23 15:44:23 2021 Recipie

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
Hi Jing, I'm afraid there is no possible to Window TopN in SQL on 1.12 version because window TopN is introduced since 1.13. > I saw the one possibility is to create a table and insert the aggregated data to the table, then do top N like [1]. However, I cannot make this approach work because I nee

Window Top N for Flink 1.12

2021-12-23 Thread Jing
Hi, Flink community, Is there any existing code I can use to get the window top N with Flink 1.12? I saw the one possibility is to create a table and insert the aggregated data to the table, then do top N like [1]. However, I cannot make this approach work because I need to specify the connector f