Ways to detect a scaling event within a flink operator at runtime

2024-05-23 Thread Chetas Joshi
Hello, On a k8s cluster, I have the flink-k8s-operator running 1.8 with autoscaler = enabled (in-place) and a flinkDeployment (application mode) running 1.18.1. The flinkDeployment i.e. the flink streaming application has a mock data producer as the source. The source generates data points every

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
If I understand correctly, Paimon is sending `CdcRecord`-s [1] on the wire which contain not only the data, but the schema as well. With Iceberg we currently only send the row data, and expect to receive the schema on job start - this is more performant than sending the schema all the time, but has

Pulsar connector resets existing subscription

2024-05-23 Thread Igor Basov
Hi everyone, I have a problem with how Flink deals with the existing subscription in a Pulsar topic. - Subscription has some accumulated backlog - Flink job is deployed from a clear state (no checkpoints) - Flink job uses the same subscription name as the existing one; the start curso

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Ah I see, so just auto-restarting to pick up new stuff. I'd love to understand how Paimon does this. They have a database sync action which will sync entire databases, handle schema evolution, and I'm p

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Péter Váry
I will ask Marton about the slides. The solution was something like this in a nutshell: - Make sure that on job start the latest Iceberg schema is read from the Iceberg table - Throw a SuppressRestartsException when data arrives with the wrong schema - Use Flink Kubernetes Operator to restart your

Re: issues with Flink kinesis connector

2024-05-23 Thread Nick Hecht
Thank you for your help! On Thu, May 23, 2024 at 1:40 PM Aleksandr Pilipenko wrote: > Hi Nick, > > You need to use another method to add sink to your job - sinkTo. > KinesisStreamsSink implements newer Sink interface, while addSink expect > old SinkFunction. You can see this by looking at method

Re: issues with Flink kinesis connector

2024-05-23 Thread Aleksandr Pilipenko
Hi Nick, You need to use another method to add sink to your job - sinkTo. KinesisStreamsSink implements newer Sink interface, while addSink expect old SinkFunction. You can see this by looking at method signatures[1] and in usage examples in documentation[2] [1] https://github.com/apache/flink/bl

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread Zhanghao Chen
Hi John, Based on the Memory config screenshot provided before, each of your TM should have MaxDirectMemory=1GB (network mem) + 128 MB (framework off-heap) = 1152 MB. Nor will taskmanager.memory.flink.size and the total including MaxDirectMemory exceed pod physical mem, you may check the detail

Re: Task Manager memory usage

2024-05-23 Thread Zhanghao Chen
Hi Sigalit, For states stored in memory, they would most probably keep alive for several rounds of GC and ended up in the old gen of heap, and won't get recycled until a Full GC. As for the TM pod memory usage, most probabliy it will stop increasing at some point. You could try setting a large

Re: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Andrew Otto
Wow, I would LOVE to see this talk. If there is no recording, perhaps there are slides somewhere? On Thu, May 23, 2024 at 11:00 AM Carlos Sanabria Miranda < sanabria.miranda.car...@gmail.com> wrote: > Hi everyone! > > I have found in the Flink Forward website the following presentation: > "Self

issues with Flink kinesis connector

2024-05-23 Thread Nick Hecht
Hello, I am currently having issues trying to use the python flink 1.18 Datastream api with the Amazon Kinesis Data Streams Connector. >From the documentation https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/kinesis/ I have downloaded the "flink-connector-kin

"Self-service ingestion pipelines with evolving schema via Flink and Iceberg" presentation recording from Flink Forward Seattle 2023

2024-05-23 Thread Carlos Sanabria Miranda
Hi everyone! I have found in the Flink Forward website the following presentation: "Self-service ingestion pipelines with evolving schema via Flink and Iceberg " by Márton

Re: Would Java 11 cause Getting OutOfMemoryError: Direct buffer memory?

2024-05-23 Thread John Smith
Based on these two settings... taskmanager.memory.flink.size: 16384m taskmanager.memory.jvm-metaspace.size: 3072m Reading the docs here I'm not sure how to calculate the formula. My suspicion is that I may have allocated too much of taskmanager.memory.flink.size and the total including MaxDirectMe

kirankumarkathe...@gmail.com-unsubscribe

2024-05-23 Thread Kiran Kumar Kathe
Kindly un subscribe for this gmail account kirankumarkathe...@gmail.com

Re: Task Manager memory usage

2024-05-23 Thread Sigalit Eliazov
hi, thanks for your reply, we are storing the data in memory since it is a short term we thought that adding rocksdb will add overhead. On Thu, May 23, 2024 at 4:38 PM Sachin Mittal wrote: > Hi > Where are you storing the state. > Try rocksdb. > > Thanks > Sachin > > > On Thu, 23 May 2024 at 6:

Re: Task Manager memory usage

2024-05-23 Thread Sachin Mittal
Hi Where are you storing the state. Try rocksdb. Thanks Sachin On Thu, 23 May 2024 at 6:19 PM, Sigalit Eliazov wrote: > Hi, > > I am trying to understand the following behavior in our Flink application > cluster. Any assistance would be appreciated. > > We are running a Flink application clust

Help with monitoring metrics of StateFun runtime with prometheus

2024-05-23 Thread Oliver Schmied
Dear Apache Flink community,   I am setting up an apche flink statefun runtime on Kubernetes, following the flink-playground example: https://github.com/apache/flink-statefun-playground/tree/main/deployments/k8s. This is the manifest I used for creating the statefun enviroment: ```--- apiVersi

Task Manager memory usage

2024-05-23 Thread Sigalit Eliazov
Hi, I am trying to understand the following behavior in our Flink application cluster. Any assistance would be appreciated. We are running a Flink application cluster with 5 task managers, each with the following configuration: - jobManagerMemory: 12g - taskManagerMemory: 20g - taskMana

Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-23 Thread Vararu, Vadim
That’s great news. Thanks. From: Leonard Xu Date: Thursday, 23 May 2024 at 04:42 To: Vararu, Vadim Cc: user , Danny Cranmer Subject: Re: Flink kinesis connector 4.3.0 release estimated date Hey, Vararu The kinesis connector 4.3.0 release is under vote phase and we hope to finalize the releas