Re: kerberos token expire

2021-07-06 Thread Gabor Somogyi
As Yangze stated, the ticket cache expires after its lifespan. Please be aware that when a keytab is used, Flink obtains delegation tokens which are never actually used. Delegation token handling not functioning is a known issue, and we are working on a fix. w/o delegation toke

Re: mutual authentication with ssl

2021-11-26 Thread Gabor Somogyi
Hi Raul, On all systems the keystore is normally needed on the server side and the truststore on the client side. As a result it's highly advised to use different config files in these places. It's easy to see why it would be a security leak if the keystore were available on the client side (client can fake a

Re: Question about plain password in flink-conf.yaml

2022-01-18 Thread Gabor Somogyi
export SSL_PASSWORD=secret flink run -yDsecurity.ssl.rest.*-password=$SSL_PASSWORD ... app.jar This way the code which starts the workload can store the passwords in a centrally protected area. This can still be hacked, but at least the password is not stored in a plain text file. BR, G On Tue, Jan 18, 2022 at 10
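A minimal sketch of the same idea from a Java launcher, in case the workload is started from code rather than from a shell. It reads the password from the `SSL_PASSWORD` environment variable and builds the `flink run` argument list. The class name `FlinkRunCommand` is hypothetical, and expanding the `*-password` wildcard into the three REST password options (`key-password`, `keystore-password`, `truststore-password`) is an assumption about what the shorthand above stands for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: assembles a "flink run" command line
// where the SSL passwords come from the caller instead of flink-conf.yaml.
final class FlinkRunCommand {
    // Assumption: these are the options the "*-password" shorthand refers to.
    private static final String[] PASSWORD_OPTIONS = {
        "key-password", "keystore-password", "truststore-password"
    };

    static List<String> build(String sslPassword, String jar) {
        List<String> cmd = new ArrayList<>();
        cmd.add("flink");
        cmd.add("run");
        for (String option : PASSWORD_OPTIONS) {
            // Same -yD dynamic property form as in the shell example.
            cmd.add("-yDsecurity.ssl.rest." + option + "=" + sslPassword);
        }
        cmd.add(jar);
        return cmd;
    }

    public static void main(String[] args) {
        String password = System.getenv("SSL_PASSWORD");
        if (password == null) {
            System.err.println("SSL_PASSWORD is not set");
            return;
        }
        // Print the command with the secret masked; a real launcher would
        // hand the unmasked list to ProcessBuilder instead.
        for (String part : build(password, "app.jar")) {
            System.out.println(part.replace(password, "****"));
        }
    }
}
```

The password still ends up on the command line of the started process, so this only moves the secret out of the config file, as the message says.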

Re: Reply:DelegationTokenManager

2022-06-21 Thread Gabor Somogyi
Thanks for pinging me! Yes, this is my main target to finish this feature however there are major code parts which are still missing. Please have a look at the umbrella jira to get better understanding: https://issues.apache.org/jira/browse/FLINK-21232 In general it's not advised to use it for pro

Re: Re: [ANNOUNCE] Apache Flink 1.15.1 released

2022-07-12 Thread Gabor Somogyi
Flink tried to create the following dir: tm_localhost:50329-fc0146 A colon is allowed on Linux but not on Windows, and that's the reason for the exception. BR, G On Tue, Jul 12, 2022 at 11:30 AM wrote: > ... > 2022-07-12 11:25:08,448 INFO > akka.remote.Remoting

Re: Re: [ANNOUNCE] Apache Flink 1.15.1 released

2022-07-12 Thread Gabor Somogyi
As a hotfix, please set "taskmanager.resource-id" to something which doesn't contain any special character. G On Tue, Jul 12, 2022 at 11:59 AM Gabor Somogyi wrote: > Flink tried to create the following dir: tm_localhost:50329-fc0146 > Colon is allowe
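To pick a safe value, one can check a candidate id against the characters Windows rejects in file and directory names. The sketch below is only an illustration: `ResourceIdCheck` is a hypothetical helper, not Flink code, and it assumes the id ends up verbatim in a directory name, as the `tm_localhost:50329-fc0146` example suggests.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink.
final class ResourceIdCheck {
    // Characters that Windows does not allow in file or directory names:
    // < > : " / \ | ? *
    private static final Pattern WINDOWS_RESERVED = Pattern.compile("[<>:\"/\\\\|?*]");

    // True when the id contains none of the reserved characters.
    static boolean isWindowsSafe(String resourceId) {
        return !WINDOWS_RESERVED.matcher(resourceId).find();
    }

    // Replaces every reserved character with an underscore.
    static String sanitize(String resourceId) {
        return WINDOWS_RESERVED.matcher(resourceId).replaceAll("_");
    }

    public static void main(String[] args) {
        String dir = "tm_localhost:50329-fc0146";
        System.out.println(dir + " safe on Windows? " + isWindowsSafe(dir));
        System.out.println("sanitized: " + sanitize(dir));
    }
}
```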

Re: Flink application mode/S3A Exception after upgrading from to Flink 1.16.0

2023-01-27 Thread Gabor Somogyi
The min supported version was 2.8.5 but in 1.17 it's gonna be 2.10.1 so one can downgrade. G On Fri, Jan 27, 2023, 20:42 Leon Xu wrote: > Thank you Mate. > Yeah this looks like the root cause. A follow-up question, do you know if > Flink 1.16 will have a hard dependency on Hadoop 3.3.x? or can

Re: Delegation Tokens config - Upgrade from 1.16.x to 1.17.0

2023-04-07 Thread Gabor Somogyi
Hi Arthur, Delegation tokens were enabled all the time which is not changed since it would be a breaking change. I would personally turn it off by default but it's important to keep original behavior. The manager is loading providers at the very beginning of the init process. It loads and initial

Re: Facing issue when using S3 in Flink 1.17

2023-04-20 Thread Gabor Somogyi
Hi Sriram, This has been fixed in https://issues.apache.org/jira/browse/FLINK-31839 G On Thu, Apr 20, 2023 at 4:57 PM Sriram Ganesh wrote: > Hi Team, > > I am using S3 as FileSystem to write data from Flink. I am getting the > below error in Flink 1.17. The same code works in Flink 1.16. Coul

Re: Encryption of parameters in flink-conf.yaml

2023-05-09 Thread Gabor Somogyi
hi Anuj, As Martijn said IAM is the preferred option but if you've no other way than access keys then environment variables is a better choice. Such case conf doesn't contain plain text keys. Just a side note, putting `s3a.access.key` into Flink conf file is not configuring Hadoop S3. The way how

Re: Keytab Setup on Kubernetes

2023-09-05 Thread Gabor Somogyi
hi Chirag, Flink now supports 2 ways to have TGT which is a Kerberos ticket and has nothing to do with the "until 7 days renewable" HDFS TGS ticket (with default config). * Keytab: if one mounts a keytab for at least the JobManager pod then it can create TGT infinitely (or until the user's passwo

Re: Keytab Setup on Kubernetes

2023-09-07 Thread Gabor Somogyi
o applicable in Flink 1.16? > > Thanks > > On Tuesday, 5 September, 2023 at 07:15:07 pm IST, Gabor Somogyi < > gabor.g.somo...@gmail.com> wrote: > > > hi Chirag, > > Flink now supports 2 ways to have TGT which is a Kerberos ticket and has > nothing to do with

Re: Securing Keytab File in Flink

2023-09-15 Thread Gabor Somogyi
Hi Chirag, Couple things can be done to reduce the attack surface (including but not limited to): * Use delegation tokens where only JM needs the keytab file: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/security/security-delegation-token/ * Limit the access rights of the k

Re: SecurityManager in Flink

2024-03-06 Thread Gabor Somogyi
Hi Kirti, Not sure what is the exact issue here but I'm not convinced that having FlinkSecurityManager is going to solve it. Here is the condition however: * cluster.intercept-user-system-exit != DISABLED (this must be changed) * cluster.processes.halt-on-fatal-error == false (this is good by defa

Re: Setting uid hash for non-legacy sinks

2024-06-07 Thread Gabor Somogyi
Hi Salva, Just wondering why not good to set the uid like this? ``` output.sinkTo(outputSink).uid("my-human-readable-sink-uid"); ``` >From the mentioned UID Flink is going to make the hash which is consistent from UID -> HASH transformation perspective. BR, G On Fri, Jun 7, 2024 at 7:54 AM Sa

Re: Setting uid hash for non-legacy sinks

2024-06-09 Thread Gabor Somogyi
ply the > same strategy for generating uids to compute the corresponding uidHash for > each suboperator. Maybe you can further investigate it and fire a JIRA > issue on it. > > Best, > Zhanghao Chen > -- > *From:* Salva Alcántara > *Sent:* Sunday, June

Re: Setting uid hash for non-legacy sinks

2024-06-10 Thread Gabor Somogyi
YW, ping me back whether it works because it's a nifty feature. G On Mon, Jun 10, 2024 at 9:26 AM Salva Alcántara wrote: > Thanks Gabor, I will give it a try! > > On Mon, Jun 10, 2024 at 12:01 AM Gabor Somogyi > wrote: > >> Now I see the intention and then you

Re: Setting uid hash for non-legacy sinks

2024-06-12 Thread Gabor Somogyi
> Salva > > On Mon, Jun 10, 2024 at 9:49 AM Gabor Somogyi > wrote: > >> YW, ping me back whether it works because it's a nifty feature. >> >> G >> >> On Mon, Jun 10, 2024 at 9:26 AM Salva Alcántara >> wrote: >> >>> Thanks Gabor,

Re: Reminder: Help required to fix security vulnerabilities in Flink Docker image

2024-06-23 Thread Gabor Somogyi
Hi Elakiya, I've just double checked the story and seems like the latest 1.17 gosu release is not vulnerable. Can you please try it out on your side? Alexis has written down how you can bump the docker version locally: ---CUT-HERE--- ENV GOSU_VERSION 1.17 ---CUT-HERE--- Please report back and we

Re: Flink write ADLS show error: No FileSystem for scheme "file"

2024-06-26 Thread Gabor Somogyi
Hi Xiao, I'm not quite convinced that the azure plugin ruined your workload, I would take a look at the dependency graph you've in the pom. Adding multiple deps can conflict in terms of class loader services (src/main/resources/META-INF/services). As an example you've 2 such dependencies where or

Re: Flink write ADLS show error: No FileSystem for scheme "file"

2024-07-02 Thread Gabor Somogyi
> > > > > > > > And like my reply in stackoverflow, I found the hadoop-common file : > https://github.com/apache/hadoop/blob/release-3.3.4-RC1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L3374 &g

Re: Parallelism of state processor jobs

2024-07-05 Thread Gabor Somogyi
Hi Alexis, It depends. When one uses SavepointLoader to read metadata only then it's non-parallel. SavepointReader however is basically a normal batch job with all its features. G On Fri, Jul 5, 2024 at 5:21 PM Alexis Sarda-Espinosa < sarda.espin...@gmail.com> wrote: > Hello, > > Really quick

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-07-29 Thread Gabor Somogyi
I've double checked and I think that CollectSinkOperatorFactory is initialized in DataStream.collectAsync without MAX_BATCH_SIZE and SOCKET_TIMEOUT values coming from the Flink config. Could you plz share the whole stacktrace to double check my assumption? G On Tue, Jul 23, 2024 at 12:46 PM Salv

Re: Expose rocksdb options for flush thread.

2024-07-29 Thread Gabor Somogyi
This has not moved for a while, so I've assigned it to you. G On Mon, Jul 15, 2024 at 9:06 AM Zhongyou Lee wrote: > Hello everyone: > > Up to now, the only way to adjust the RocksDB flush threads is for the user to implement > ConfigurableRocksDBOptionsFactory#setMaxBackgroundFlushes. I found > FLINK-22059

Re: Restore rocksDB from savepoint exception

2024-08-08 Thread Gabor Somogyi
Hi Bjarke, It's normal to see longer recovery time as the state size grows. There are developments in progress to mitigate this problem. > Any ideas as to what could be causing this? I think there is no easy way to tell it, I can give just some pointers. First I would take a look at the state fi

Re: Flink jobs failed in Kerberos delegation token

2024-08-13 Thread Gabor Somogyi
Hi, As a general suggestion please use the official releases, since we're not able to analyze any kind of custom code with cherry-picks and potentially hand-made conflict resolution. When you state "needs to be restarted due to an exception", what kind of exception are you referring to? I me

Re: Flink jobs failed in Kerberos delegation token

2024-08-14 Thread Gabor Somogyi
ken* > > *(*FLIP-211: Kerberos delegation token framework - Apache Flink - Apache > Software Foundation > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-211%3A+Kerberos+delegation+token+framework> > *)* > > > You can verify this Kerberos scenario, thank you. >

Re: Flink jobs failed in Kerberos delegation token

2024-08-14 Thread Gabor Somogyi
e is a Yarn parameter called > `yarn.resourcemanager.proxy-user-privileges.enabled` that seems to be > suitable for long-running applications. Are you familiar with this > parameter? Can it be used to solve Flink's issue? > > > -- > > > > >

Re: Restore rocksDB from savepoint exception

2024-08-19 Thread Gabor Somogyi
rocksDB tuning and reducing state size if possible (better eviction, etc...). > > Best regards, > Bjarke > > On Thu, Aug 8, 2024 at 11:16 AM Gabor Somogyi > wrote: > >> Hi Bjarke, >> >> It's normal to see longer recovery time as the state size grows. There >

Re: Restore rocksDB from savepoint exception

2024-08-21 Thread Gabor Somogyi
fs.core.windows.net/test/checkpointing > state.savepoints.dir: abfss://@. > dfs.core.windows.net/test/savepoints > taskmanager.network.memory.buffer-debloat.enabled: "true" > taskmanager.numberOfTaskSlots: "1" > > Best regards, > Bjarke > &

Re: Kafka connector exception restarting Flink 1.19 pipeline

2024-09-03 Thread Gabor Somogyi
Hi Dominic, The issue has nothing to do with DynamicKafkaSource. What happens here is clear: * At some point in time you've used the 3.2 Kafka connector, which writes out its state with the v2 serializer * Later you fell back to a version which is not able to read that (pre 3.1.0 because t

Re: Kafka connector exception restarting Flink 1.19 pipeline

2024-09-03 Thread Gabor Somogyi
to reload the state with an old > dependency (which I cannot even find on the classpath) 😲 > > > > > *Dominik Bünzli *Data, Analytics & AI Engineer III > > > > *From: *Gabor Somogyi > *Date: *Tuesday, 3 September 2024 at 12:59 > *To: *Bünzli

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-15 Thread Gabor Somogyi
[] - > Sending Request: GET > https://.../savepoints/flink-compression/.../savepoint-... > Range: bytes=0-9223372036854775806. > ``` > > For uncompressed state could you please let us know how the change from > your PR eliminates the multiple calls to S3

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-15 Thread Gabor Somogyi
as a > byte[] and do the reads from memory ... but it seems the state compression > is doing the same. We are in the process of testing state compression under > production volumes ... can't say how that will actually work for us. > > Thank you again for looking into this.

Re: Flink job can't complete initialisation because of millions of savepoint file reads

2024-10-15 Thread Gabor Somogyi
Hi Alex, Please see my comment here [1]. [1] https://lists.apache.org/thread/h5mv6ld4l2g4hsjszfdos9f365nh7ctf BR, G On Mon, Sep 2, 2024 at 11:02 AM Alex K. wrote: > We have an issue where a savepoint file containing Kafka topic partitions > offsets is requested millions of times from AWS S3.

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-15 Thread Gabor Somogyi
Hi William, This is a somewhat old question, but I think we now know why this is happening. Please see [1] for further details. It's an important requirement to use uncompressed state because even with the fix compressed state is still problematic. We've already tested the PR with load but if you can repo

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-17 Thread Gabor Somogyi
some time since our checkpoints can reach >> more than 10GB (uncompressed). >> >> Let me know if you have any updates. I will let you know if I observe >> anything else. >> Please let us know if you have any new findings. >> >> Thank you. >> >> On Tue, Oct

Re: OperatorStateFromBackend can't complete initialisation because of high number of savepoint files reads

2024-10-18 Thread Gabor Somogyi
ation performing well after applying my PR [1]. Once it's merged we can say both compressed and uncompressed state are safe to use. Thanks to everybody for the efforts to sort this out! [1] https://github.com/apache/flink/pull/25509 G On Thu, Oct 17, 2024 at 11:06 PM Gabor Somo

Re: setUidHash not working (???)

2024-09-24 Thread Gabor Somogyi
(uidHash)) .findFirst(); assertEquals(uidHash, pair.get().getUserDefinedOperatorID().get().toString()); Please load your savepoint with SavepointLoader and analyze what kind of operators are inside and why is your app blowing up. G On Tue, Sep 24, 2024 at 9:29 AM Gabor Somogy

Re: Help regarding Flink Stateful API

2024-10-01 Thread Gabor Somogyi
s/egress > 4) How can I create a docker image of flink stateful APIs, I was unable to > decouple the source code and build a docker image. I used the playground > below. > > https://github.com/apache/flink-statefun-playground/tree/release-3.3/playground-internal > > Best Regards

Re: Help regarding Flink Stateful API

2024-10-01 Thread Gabor Somogyi
Hi Nitin, Flink applications can be started locally (run the main), for example from IntelliJ or any similar IDE. Note that in this case the execution path is different, but it's convenient for debugging business logic. BR, G On Tue, Oct 1, 2024 at 12:21 PM Nitin Chauhan wrote: > H

Re: setUidHash not working (???)

2024-09-24 Thread Gabor Somogyi
Hi Salva, Which version is this? BR, G On Mon, Sep 23, 2024 at 8:46 PM Salva Alcántara wrote: > I have a pipeline where I'm trying to add a new operator. The problem is > that originally I forgot to specify the uid for one source. To remedy this, > I'm using setUidHash, providing the operator

Re: "Not all slot managed memory is freed"

2024-11-06 Thread Gabor Somogyi
Hi Jad, There is no low-hanging fruit here if you really want to find this out. In this case the memory manager tries to allocate and deallocate the total memory it is prepared for. When not all the memory is available, this does not succeed and you see the mentioned exception. I wo

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-12-05 Thread Gabor Somogyi
I intend to file a Jira + PR. G On Thu, Dec 5, 2024 at 2:40 PM Salva Alcántara wrote: > So what's next? Do you want me to do something Gabor? > > Regards, > > Salva > > On Thu, Dec 5, 2024, 13:48 Gabor Somogyi > wrote: > >> Based on t

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-12-05 Thread Gabor Somogyi
op(MailboxProcessor.java:231) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807) > at > org.apache.flink.runtime.taskmanager.Task.ru

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-12-08 Thread Gabor Somogyi
test > your changes locally. If still necessary, let me know and I'll find the > time! > > > On Fri, Dec 6, 2024 at 1:49 PM Gabor Somogyi > wrote: > >> Salva, I've tested the code myself but do you have the possibility to >> test this[1] fix? >> &

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-12-06 Thread Gabor Somogyi
Salva, I've tested the code myself, but would you be able to test this fix [1]? [1] https://github.com/apache/flink/pull/25755 G On Thu, Dec 5, 2024 at 3:15 PM Gabor Somogyi wrote: > I intend to file a Jira + PR. > > G > > > On Thu, Dec 5, 202

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
ch entry and extract the bit we > want in that scenario. > > Never mind > > Thank you for the insight. Saves me a lot of hunting for nothing. > > JM > > On Tue, Feb 4, 2025 at 10:45 AM Gabor Somogyi > wrote: > >> Hi Jean-Marc, >> >> We've a

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
d all other aspects. G On Tue, Feb 4, 2025 at 2:26 PM Gabor Somogyi wrote: > Please report back on how the patch behaves including any side effects. > > Now I'm in testing the state reading with processor API vs the mentioned > job where we control the keys. > The differen

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
Hi Jean-Marc, We've already realized that the RocksDB approach does not meet the performance criteria it should. There is an open issue for it [1]. The hashmap-based approach has always required more memory. So if the memory footprint is a hard requirement then RocksDB is the on

Re: How to read a savepoint fast without exploding the memory

2025-02-04 Thread Gabor Somogyi
b performance in an acceptable range for us we > might go that way. I really like the light memory consumption of RockDb for > that kind of side job. > > Thanks > > JM > > On Tue, Feb 4, 2025 at 12:23 PM Gabor Somogyi > wrote: > >> What I could imagine is to

Re: How to read a savepoint fast without exploding the memory

2025-02-05 Thread Gabor Somogyi
> controlled environment. but I think it's conclusive enough. I will need to > run further tests but I think we will patch our Flink. using a system > property to configure it. > > Hope this help > > JM > > On Tue, Feb 4, 2025 at 4:01 PM Gabor Somogyi > wrote: > &g

Re: How to read a savepoint fast without exploding the memory

2025-02-06 Thread Gabor Somogyi
sm 1, and only 1 task manager) . I also know > it has been created by a job using a HashMap backend. And I do not care > about duplicates. > > I should still be good, right? from what I saw I never read any duplicate > keys. > > Thanks > > JM > > > On Wed, Feb 5,

Re: Skip Restore Incompatibility of Key Serializer for Savepoint

2025-02-10 Thread Gabor Somogyi
as > wondering if you could share more in - depth details about this > configuration. > Your insights would be of great help to me in resolving this issue. Thank > you very much in advance. > Best regards,Leilinee > > 获取 Outlook for iOS <https://aka.ms/o0ukef> > --

Re: ssl certificate reload capabilities

2025-02-10 Thread Gabor Somogyi
Hi Nicolas, This is only supported by the operator, not by Flink itself. It's on my list, but I haven't gotten to it yet. BR, G On Mon, Feb 10, 2025 at 10:41 AM Nicolas Fraison via user < user@flink.apache.org> wrote: > Hi, > > We are looking into enabling SSL on our flink jobs. > Follo

Re: How to read a savepoint fast without exploding the memory

2025-02-07 Thread Gabor Somogyi
rator and 2 value states - Old execution time: 25M27.126737S - New execution time: 1M19.602042S In short: ~95% performance gain. G On Thu, Feb 6, 2025 at 9:06 AM Gabor Somogyi wrote: > In short, when you don't care about > multiple KeyedStateReaderFunction.readKey calls then yo
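As a quick check of the "~95%" figure, the two execution times quoted above can be read as ISO-8601 durations (the `PT` prefix is added here, which is an assumption about how the values were printed). The class name `SpeedupCheck` is hypothetical.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper: compares two run times and reports the relative reduction.
final class SpeedupCheck {
    // Fraction of the original run time that was saved, between 0 and 1.
    static double reduction(Duration before, Duration after) {
        return 1.0 - (double) after.toNanos() / before.toNanos();
    }

    public static void main(String[] args) {
        Duration before = Duration.parse("PT25M27.126737S"); // old execution time
        Duration after = Duration.parse("PT1M19.602042S");   // new execution time
        System.out.println(String.format(Locale.ROOT,
                "run time reduced by %.1f%%", 100 * reduction(before, after)));
        System.out.println(String.format(Locale.ROOT,
                "speedup factor: %.1fx", (double) before.toNanos() / after.toNanos()));
    }
}
```

The reduction comes out just under 95%, which is consistent with the rounded number in the message.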

Re: How to read a savepoint fast without exploding the memory

2025-02-05 Thread Gabor Somogyi
ere any plans on a migration guide or something for users to adapt > their QS observers (beyond the current docs)? (State-)Observability-wise > Flink has some room for improvement I would say. > > Regards, > > Salva > > > On Wed, Feb 5, 2025 at 9:36 AM Gabor Somogyi >

Re: How to read a savepoint fast without exploding the memory

2025-02-10 Thread Gabor Somogyi
Hi Jean-Marc, FYI, I've just opened this [1] PR to address the issue in a clean way. May I ask you to test it on your side? [1] https://github.com/apache/flink/pull/26134 BR, G On Fri, Feb 7, 2025 at 6:14 PM Gabor Somogyi wrote: > Just a little update on this. We've made our f

Re: How to read a savepoint fast without exploding the memory

2025-02-11 Thread Gabor Somogyi
utFormatSourceFunction.run(InputFormatSourceFunction.java:90) > ~[flink-runtime-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] > at > org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:107) > ~[flink-runtime-2.1-SNAPSHOT.jar:2.1-SNAPSHOT] > at > org.apache.flink.strea

Re: How to read a savepoint fast without exploding the memory

2025-02-12 Thread Gabor Somogyi
s with your patch, which is exactly what I am > expecting. If you have more specific concerns I suppose I can instrument my > code further. I have no expectation regarding the key ordering. > > JM > > > On Wed, Feb 12, 2025 at 11:29 AM Gabor Somogyi > wrote: > >> I t

Re: How to read a savepoint fast without exploding the memory

2025-02-12 Thread Gabor Somogyi
t; at >>> com.ibm.aiops.lifecycle.policy.execution.state.RequestExecutionStateManager.(RequestExecutionStateManager.java:52) >>> ~[classes/:?] >>> at >>> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native >>> Meth

Re: SavepointReader: Record size is too large for CollectSinkFunction

2024-12-11 Thread Gabor Somogyi
package -DskipTests > > runs fine in my terminal. > > Regards, > > Salva > > On Sun, Dec 8, 2024 at 12:06 PM Gabor Somogyi > wrote: > >> It would be good to report back whether it's solved your issue, but since >> we've added automated test + I&#

Re: AWS S3 - Sink and StateBackend

2024-12-18 Thread Gabor Somogyi
Hi Patricia, Different S3 buckets can be accessed with different S3A client configurations. This allows for different endpoints, data read and write strategies, as well as login details. A bucket s3a://nightly/ used for nightly data can then be given a session key: fs.s3a.bucket.nightly.access.key
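The per-bucket mechanism can be pictured as a two-step lookup: a `fs.s3a.bucket.<bucket>.<option>` entry wins over the general `fs.s3a.<option>` entry. The sketch below only mimics that precedence over a plain map to illustrate the naming scheme. It is not Hadoop's implementation; `PerBucketLookup` and the key values are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of S3A per-bucket option precedence.
final class PerBucketLookup {
    // Returns the bucket-specific value if present, else the general one.
    static Optional<String> resolve(Map<String, String> conf, String bucket, String option) {
        String perBucket = conf.get("fs.s3a.bucket." + bucket + "." + option);
        if (perBucket != null) {
            return Optional.of(perBucket);
        }
        return Optional.ofNullable(conf.get("fs.s3a." + option));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3a.access.key", "GENERAL-KEY");                // placeholder value
        conf.put("fs.s3a.bucket.nightly.access.key", "NIGHTLY-KEY"); // placeholder value

        // The "nightly" bucket gets its own key; any other bucket falls back.
        System.out.println(resolve(conf, "nightly", "access.key").orElse("<unset>"));
        System.out.println(resolve(conf, "checkpoints", "access.key").orElse("<unset>"));
    }
}
```

With this layout a sink bucket and a state backend bucket can use different credentials in the same configuration.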

Re: How to read a savepoint fast without exploding the memory

2025-02-14 Thread Gabor Somogyi
learning > exercise for us. > > Kind regards > > JM > > > > On Wed, Feb 12, 2025 at 12:57 PM Gabor Somogyi > wrote: > >> Hi Jean-Marc, >> >> Thanks for your efforts! We've done quite extensive tests inside and they >> are all passing. Goo

Re: Issues with state corruption in RocksDB

2025-02-14 Thread Gabor Somogyi
> *From: *Gabor Somogyi > *Date: *Friday, F

Re: Skip Restore Incompatibility of Key Serializer for Savepoint

2025-02-08 Thread Gabor Somogyi
Hi leilinee, Since Flink SQL does not support operator uid configuration, there are only hard ways. A state operator uid change + `execution.savepoint.ignore-unclaimed-state=true` would be the easy way. Depending on your constraints, either you use the state processor API and remove the operator whic

Re: Issues with state corruption in RocksDB

2025-02-14 Thread Gabor Somogyi
BTW, do you have a bare minimum application which can reproduce the issue? On Fri, Feb 14, 2025 at 5:07 PM Gabor Somogyi wrote: > Hi Peter, > > I've had a look on this and seems like this problem or a similar one still > exists in 1.20. > Since I've not done in-dept

Re: Issues with state corruption in RocksDB

2025-02-14 Thread Gabor Somogyi
Hi Peter, I've had a look at this and it seems like this problem or a similar one still exists in 1.20. Since I've not done an in-depth analysis, I can only say this based on the latest comments, like: > Faced an issue on Flink 1.20.0 but there are more of such... So all in all I've the feelin

Re: How can we read checkpoint data for debugging state

2025-02-21 Thread Gabor Somogyi
n > > > On Fri, Feb 21, 2025 at 3:37 PM Gabor Somogyi > wrote: > >> The UID must match in the Flink app `events.uid("my-uid")` and in the >> reader app `forUid("my-uid")`. >> In general it's a good practice to set uid in the Flink app for each

Re: How can we read checkpoint data for debugging state

2025-02-21 Thread Gabor Somogyi
The UID must match in the Flink app `events.uid("my-uid")` and in the reader app `forUid("my-uid")`. In general it's a good practice to set uid in the Flink app for each and every operator, otherwise Flink generates an almost random number for it. When you don't know the generated uid then no worr

Re: How can we read checkpoint data for debugging state

2025-02-21 Thread Gabor Somogyi
Hi Sachin, You're on the right track in general and this should work :) I know that a version upgrade can be complex, but it's worth doing because we've just added a ~95% performance boost for the state processor API in 1.20 [1]. As you mentioned the SQL connector is on 2.x which is not yet feasible

Re: How can we read checkpoint data for debugging state

2025-02-26 Thread Gabor Somogyi
te("Analysis"); > > > Any idea as to what could be going wrong. > Also note this is my checkpoint config: > state.backend.type: rocksdb > state.checkpoints.dir: s3://{bucket}/flink-checkpoints > state.backend.incremental: 'true' > state.backend.local-recovery: 'tr

Re: How can we read checkpoint data for debugging state

2025-02-26 Thread Gabor Somogyi
ate one of my operators writes to, to > understand if there is a state leak and why this state becomes very huge. > > Thanks > Sachin > > > On Wed, Feb 26, 2025 at 5:01 PM Gabor Somogyi > wrote: > >> State processor API is now not supporting changelog. Plea

Re: Is incremental checkpointing append-only (forever growing) on s3?

2025-02-19 Thread Gabor Somogyi
Hi Vadim, In short, state eviction is not applied to incremental checkpoints from a storage perspective, but there are techniques to configure Flink to do it. I think you are looking for the state TTL feature, which can resolve your problem [1]. [1] https://flink.apache.org/2019/05/17/state-ttl-in-flink-1.8.0-h

Re: Skip Restore Incompatibility of Key Serializer for Savepoint

2025-03-16 Thread Gabor Somogyi
e that couldn't be > restored by the operator ID. Finally, I used the API to rewrite the new > savepoint file, replacing the old one, the new Job can restor state and > running well. > > Best regards, leilinee > > ------ > *发件人:* Gabor Somogyi > *

Re: Clarification on checkpoint/savepoint usage with Kafka (and RabbitMQ)

2025-03-26 Thread Gabor Somogyi
> Best Regards, > > Le mer. 26 mars 2025 à 07:18, Gabor Somogyi a > écrit : > >> In short it's encouraged to use savepoint because of the following >> situation: >> * You start processing data from offset 0 >> * 2 savepoints created, one with offset 10, a

Re: ssl certificate reload capabilities

2025-03-26 Thread Gabor Somogyi
possible to merge it to Flink and how I should proceed (just push a PR > to flink repo or should I need a FLIP for such change)? > > Regards, > Nicolas > > On Mon, Feb 10, 2025 at 11:27 AM Gabor Somogyi > wrote: > >> Hi Nicolas, >> >> This is only supported b

Re: Slack invitation

2025-03-26 Thread Gabor Somogyi
Hi Or, Is the link in the official place [1] not working? [1] https://flink.apache.org/what-is-flink/community/#slack BR, G On Wed, Mar 26, 2025 at 9:44 AM Or Keren wrote: > Hey, > > Can I can an invitation for your slack channel please? > > Thanks, > Or. >

Re: Clarification on checkpoint/savepoint usage with Kafka (and RabbitMQ)

2025-03-25 Thread Gabor Somogyi
In short, using savepoints is encouraged because of the following situation: * You start processing data from offset 0 * 2 savepoints are created, one with offset 10, another with 20 * At this point Kafka has offset 20 since that's the last processed * At offset 30 you realize that data processed bet

Re: Slack invitation

2025-03-26 Thread Gabor Somogyi
I don't have access to fix this, hope the appropriate person will pick this up... G On Wed, Mar 26, 2025 at 9:57 AM Or Keren wrote: > Hi Gabor, > > Unfortunately not. > > [image: PastedGraphic-1.png] > > Thanks for the quick reply! > > On 26 Mar 2025, at 10:54,