Re:Re: Window Aggregation and Window Join ability not work properly

2021-12-22 Thread cy
I change to watermark for `datatime` as `datatime` - interval '1' second or watermark for `datatime` as `datatime` but is still not work. At 2021-12-23 15:16:20, "Yun Gao" wrote: Hi Caiyi, The window need to be finalized with watermark[1]. I noticed that the watermark defined

Re: Window Aggregation and Window Join ability not work properly

2021-12-22 Thread Yun Gao
Hi Caiyi, The window need to be finalized with watermark[1]. I noticed that the watermark defined is `datatime` - INTERVAL '5' MINUTE, it means the watermark emitted would be the maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to trigger the window of 16:00 ~ 16:

Re: Operator state in New Source API

2021-12-22 Thread Yun Gao
Hi Krzysztof, If I understand right, I think managed operator state might not help here since currently Flink only support in-memory operator state. Is it possible currently we first have a customized SplitEnumerator to skip the processed files in some other way? For example, if these files hav

Re: Operator state in New Source API

2021-12-22 Thread Yun Tang
Hi Krzysztof, Non-keyed operator state only supports list-like state [1] as there exist no primary key in operator state. That is to say you cannot use map state in source operator. [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/state/#using

Operator state in New Source API

2021-12-22 Thread Krzysztof Chmielewski
Hi, Is it possible to use managed operator state like MapState in an implementation of new unified source interface [1]. I'm especially interested with using Managed State in SplitEnumerator implementation. I have a use case that is a variation of File Source where I will have a great number of fi

[ANNOUNCE] Apache Flink Stateful Functions 3.1.1 released

2021-12-22 Thread Igal Shilman
The Apache Flink community is very happy to announce the release of Apache Flink Stateful Functions (StateFun) 3.1.1. This is a bugfix release that addresses the recent log4j vulnerabilities, users are encouraged to upgrade. StateFun is a cross-platform stack for building Stateful Serverless appl

Re: Waiting on streaming job with StatementSet.execute() when using Web Submission

2021-12-22 Thread Yuval Itzchakov
Hi Caizhi, This is our program printing out the status code, but it doesn't really matter. The point is that I have no ability to run a StatementSet through the WebSubmission Rest API without blocking ATM. On Wed, Dec 22, 2021 at 1:39 PM Caizhi Weng wrote: > Hi! > > This "Finished successfully w

Re: Avoiding Dynamic Classloading for User Code

2021-12-22 Thread David Morávek
1. Yes, I'm not aware of a way to avoid it right now when you're submitting job via REST API. 2. Hopefully not, the classes should be always loaded from the parent loader if they can be found on classpath ... but as I've told you before, this is a hacky solution which is healing symptoms instead of

Re: Avoiding Dynamic Classloading for User Code

2021-12-22 Thread Lior Liviev
Hello David, I have some questions regarding our conversation: 1. When I put the JAR in $FLINK/lib, do I need to use your REST API to load it? 2. If I have my JAR in the folder AND I load same JAR via REST API, will I run into problems? (class loading strategy is set to parent-first) ___

log4j2 upgrade requirement

2021-12-22 Thread Puneet Duggal
Hi, Context: - I am using flink 1.12.1 version for real time event processing. This flink uses log4j 2.12.1 version. But jar that i am uploading uses 2.17.0. Now my assumption is that flink being generic in nature, does not log event specific data , logging it is responsibility of user specific

Re: Flink Checkpoint Duration/Job Throughput

2021-12-22 Thread Caizhi Weng
Hi! I see that there is no keyBy in your user code. Is it the case that some Kafka partitions contain a lot more data than others? If so, you can try datastream.rebalance() [1] to rebalance the data between each parallelism and reduce the impact of data skew. [1] https://nightlies.apache.org/flin

Re: Waiting on streaming job with StatementSet.execute() when using Web Submission

2021-12-22 Thread Caizhi Weng
Hi! This "Finished successfully with value: 0" seems quite suspicious. If you search through the code base no log is printing such information. Could you please check which component is printing this log and determine which process this exit code belongs to? Yuval Itzchakov 于2021年12月22日周三 15:48写

Re: CVE-2021-44228 - Log4j2 vulnerability

2021-12-22 Thread David Morávek
Hi Debraj, we're currently not planning another emergency release as this CVE is not as critical for Flink users as the previous one. However, this patch will be included in all upcoming patch & minor releases. The patch release for the 1.14.x branch is already in progress [1] (it may be bit delay

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-22 Thread David Morávek
Agreed, if we drop the CI for lower versions, there is actually no point of having safeguards as we can't really test for them. Maybe one more thought (it's more of a feeling), I feel that users running really old Hadoop versions are usually slower to adopt (they most likely use what the current H

Re: Flink Checkpoint Duration/Job Throughput

2021-12-22 Thread Terry Heathcote
Hi Caizhi Thank you for the response. Below is relevant code for the pipeline as requested, along with the Kafka properties we set for both the FlinkKafkaProducer and Consumer. The operator that suffers the data skew are the sinks. import Models.BlueprintCacheDataType; import PipelineBlocks.*; im

JsonRowSerializationSchema unable to parse TIMESTAMP_LTZ fields

2021-12-22 Thread Surendra Lalwani
Hi Team, JsonRowSerializationSchema is unable to parse fields with type TIMESTAMP_LTZ, seems like this is not handled properly. While trying to fire a simple query Select current_timestamp from table_name , it gives error that Could not serialize row and asks to add shaded flink dependency for jsr

Re: Kryo EOFException: No more bytes left

2021-12-22 Thread Flavio Pompermaier
Hi Dan, in my experience this kind of errors are caused by some other problem that's not immediately obvious (like some serialization, memory or RocksDB problem). Could it be that an Avro field cannot be null or viceversa? On Tue, Dec 21, 2021 at 7:21 PM Dan Hill wrote: > I was not able to repro