[Flink Kubernetes Operator] How to check FlinkDeployment and FlinkSessionJob are in a healthy state?

2024-08-23 Thread hjw
Can we make a judgment based on the fields in the status? -- Best, Hjw
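
One way to act on the status fields, as a minimal sketch rather than an official recipe: the function below takes the `status` section of a FlinkDeployment or FlinkSessionJob that has already been fetched and parsed into a dict. The field names `error`, `jobManagerDeploymentStatus` and `jobStatus.state`, and the values `READY` and `RUNNING`, are assumptions about the operator's status schema and should be checked against the CRD reference for the operator version in use. The helper name `is_healthy` and the `has_job_manager` flag are hypothetical.

```python
from typing import Any, Mapping


def is_healthy(status: Mapping[str, Any], *, has_job_manager: bool = True) -> bool:
    """Return True if the parsed status looks healthy.

    Assumed field names (verify against your operator version):
    error, jobManagerDeploymentStatus, jobStatus.state.
    """
    # Any reported error marks the resource as unhealthy.
    if status.get("error"):
        return False
    # A FlinkDeployment owns a JobManager deployment; a FlinkSessionJob does not,
    # so callers can skip this check with has_job_manager=False.
    if has_job_manager and status.get("jobManagerDeploymentStatus") != "READY":
        return False
    # The job itself is expected to be running.
    job_status = status.get("jobStatus") or {}
    return job_status.get("state") == "RUNNING"
```

For a session-mode FlinkDeployment that carries no job, the `RUNNING` check would not apply; this sketch only covers resources that are expected to have a running job.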

Re: Getting Direct buffer memory. Errors with Kafka.

2024-08-23 Thread John Smith
Online resources including my previous question to this problem said there was some client bug connecting to SSL broker that caused memory issues. As far as memory setup I have the following... Here is the link and there's a link to a JIRA... https://stackoverflow.com/questions/64697973/java-lang-

Re: Flink JOb Manager jvm metaspace not getting cleared

2024-08-23 Thread Yaroslav Tkachenko
Hi Ganesh, Have you tried following this suggestion? https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code On Fri, Aug 23, 2024 at 10:39 AM Ganesh Walse wrote: > Hi All, > > I am using apache flink for

Flink JOb Manager jvm metaspace not getting cleared

2024-08-23 Thread Ganesh Walse
Hi All, I am using Apache Flink for bounded data, where I need to execute 100 of jobs daily. With every job submission, my JobManager JVM metaspace grows by 10 MB and is never cleared, so I get a JVM OOM error after a certain number of job submissions. I searched over

Re: flink kafka sink batch mode delivery guaranties limitations

2024-08-23 Thread Nicolas Paris
Thanks, I missed that warning! All right.

Re: Getting Direct buffer memory. Errors with Kafka.

2024-08-23 Thread Yaroslav Tkachenko
Hi John, I've experienced this issue recently; it's likely caused by either: - the size of the producer record batch, which can be reduced by configuring lower linger.ms and batch.size values - the size of an individual record On Fri, Aug 23, 2024 at 7:20 AM Ahmed Hamdy wrote: > Why do you belie
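
The suggestion to lower linger.ms and batch.size can be sketched as a set of producer property overrides. `linger.ms` and `batch.size` are real Kafka producer settings; the helper name `smaller_batch_overrides` and the example values are hypothetical and only illustrate the shape of the change. Suitable values depend on the workload, and how the properties are passed to the sink depends on the connector in use.

```python
from typing import Dict


def smaller_batch_overrides(linger_ms: int = 0, batch_size_bytes: int = 8192) -> Dict[str, str]:
    """Build Kafka producer property overrides that shrink record batches.

    The default arguments are illustrative only, not recommendations.
    """
    if linger_ms < 0 or batch_size_bytes < 0:
        raise ValueError("linger_ms and batch_size_bytes must be non-negative")
    return {
        # How long the producer waits to fill a batch before sending it.
        "linger.ms": str(linger_ms),
        # Upper bound, in bytes, on the size of one batch per partition.
        "batch.size": str(batch_size_bytes),
    }


# Merge the overrides into an existing (hypothetical) producer configuration.
producer_config = {"bootstrap.servers": "localhost:9092"}
producer_config.update(smaller_batch_overrides(linger_ms=5, batch_size_bytes=4096))
```

This only addresses the first cause named in the message. If a single record is itself very large, smaller batches will not help.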

Re: flink kafka sink batch mode delivery guaranties limitations

2024-08-23 Thread Ahmed Hamdy
Hi Nicolas, Could you elaborate on what you think is missing? I can see there is a warning that EXACTLY_ONCE sink wouldn't operate. > It is important to remember that because there are no checkpoints, certain features such as CheckpointListener

Re: Getting Direct buffer memory. Errors with Kafka.

2024-08-23 Thread Ahmed Hamdy
Why do you believe it is an SSL issue? The error trace looks like a memory issue. You could refer to the TaskManager memory setup guide [1]. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_setup_tm/ Best Regards Ahmed Hamdy On Fri, 23 Aug 2024 at 13:47, John Smith

Re: Getting Direct buffer memory. Errors with Kafka.

2024-08-23 Thread John Smith
I'm pretty sure it's not SSL. Is there a way to confirm, since the task does work? And/or are there other settings I can try? On Thu, Aug 22, 2024, 11:06 AM John Smith wrote: > Hi getting this exception, a lot of resources online point to an SSL > misconfiguration. > > We are NOT using SSL. Neithe

Re: Restore rocksDB from savepoint exception

2024-08-23 Thread Bjarke Tornager
Okay, I see. Thanks for the explanation. Br, Bjarke On Wed, Aug 21, 2024 at 10:17 AM Gabor Somogyi wrote: > There is no golden egg in the RocksDB tuning area but it can help a > significantly. > > As a general rule the RocksDB state backend requires mainly 2 things to > perform well, memory a

Re: Error class RestoreMode not found after upgrading from Flink 1.19 to 1.20

2024-08-23 Thread Zakelly Lan
Hi, > I suppose the solution would be to cancel the job based on Flink 1.19 and > then resubmit it with Flink 1.20. I think so. > I hope this reply will be in the thread as I smartly configured to only > received digests... so sending this reply with the same titler and > hopefully it will be added

flink kafka sink batch mode delivery guaranties limitations

2024-08-23 Thread Nicolas Paris
Hi, From my tests, the Kafka sink in exactly-once mode under the batch runtime never commits the transaction, so the semantic is not honoured. This is likely by design, since records are acked/committed during a checkpoint, which never happens in batch mode. Am I missing something or the documentation should w

Error class RestoreMode not found after upgrading from Flink 1.19 to 1.20

2024-08-23 Thread Jean-Marc Paulin
Hi, We did further testing and verification on this, and it seems to be an issue only in HA with ZooKeeper. In that scenario I believe we resume from a checkpoint. I suppose the solution would be to cancel the job based on Flink 1.19 and then resubmit it with Flink 1.20. Yes we recompiled our app

Re: Error class RestoreMode not found after upgrading from Flink 1.19 to 1.20

2024-08-23 Thread Zakelly Lan
Hi Jean-Marc, I think this is related to https://issues.apache.org/jira/browse/FLINK-34455 . Could you provide more information about your setup and how you upgrade your job? Have you re-compiled your job under 1.20? Best, Zakelly On Fri, Aug 23, 2024 at 1:36 AM Jean-Marc Paulin wrote: > Hi,