Cannot set advanced RocksDB options in state processor job

2025-05-02 Thread Alexis Sarda-Espinosa
Hello, I am troubleshooting a slow state processor job with Flink 1.20.1 and I wanted to tune RocksDB to see if it helped, for example by setting [1]. However, when I run the job and look at the RocksDB logs, I can see those configurations are not actually applied even though Flink's own logs show

Basic guidance for RocksDB managed memory sizing

2025-04-15 Thread Alexis Sarda-Espinosa
Hello there, I have a very high-level question about RocksDB memory (a level-0 question, you might say). If I understand the documentation [1] correctly, the write buffers for Level 0 are always in memory, but RocksDB supports some optimizations [2], and I don't know if Flink configures that inter

Re: Flink 1.20.0 missing linux/amd64 images

2025-02-26 Thread Alexis Sarda-Espinosa
Hello, I would like to point out that the same has happened with the images for 1.20.1 Regards, Alexis. Am So., 4. Aug. 2024 um 09:45 Uhr schrieb Bjarke Tornager < bjarketorna...@gmail.com>: > Hi Weijie, > > Thanks for looking into this. It looks like the docker-hub repo was > updated with the l

Re: unexpected high mem. usage / potential config misinterpretation

2024-09-18 Thread Alexis Sarda-Espinosa
Hi Simon, I hope someone corrects me if I'm wrong, but just based on "batch mode processing terabytes of data", I feel batch mode may be the issue. I am under the impression that batch mode forces everything emitted by the sources to RAM before any downstream operators do anything, so even if each

Re: Troubleshooting checkpoint expiration

2024-08-31 Thread Alexis Sarda-Espinosa
},).*$ logger.abfs.filter.failures.onMatch = ACCEPT logger.abfs.filter.failures.onMismatch = DENY Am Mi., 7. Aug. 2024 um 12:18 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > I must ask again if anyone at least knows if Flink's file system can > expose more detailed exceptions whe

Re: Troubleshooting checkpoint expiration

2024-08-07 Thread Alexis Sarda-Espinosa
I must ask again if anyone at least knows if Flink's file system can expose more detailed exceptions when things go wrong, Azure support is asking for specific exception messages to decide how to troubleshoot. Regards, Alexis. Am Di., 23. Juli 2024 um 13:39 Uhr schrieb Alexis Sarda-Esp

Re: Using state processor for a custom windowed aggregate function

2024-08-05 Thread Alexis Sarda-Espinosa
to find > a way to > - First synthesize (map/flatmap) the new state data element (key > x window x accumulator) from the old state such that > - you can aggregate it into the new state > - (cardinalities could change) > > > > Thia

Re: Using state processor for a custom windowed aggregate function

2024-08-02 Thread Alexis Sarda-Espinosa
n, > > TypeInformation keyType, > > TypeInformation accType, > > TypeInformation outputType) > > throws IOException { > > > > > > Cheers > > > > Thias > > > > PS: will you come to the FlinkForward conferen

Re: Using state processor for a custom windowed aggregate function

2024-07-31 Thread Alexis Sarda-Espinosa
d KEY=GenericServiceCompositeKey(serviceId=X, countryCode=BAR) Why is this null? So a key is missing, and the key that was written has a null state. Regards, Alexis. Am Mi., 31. Juli 2024 um 15:45 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hi Matthias, > > This indeed compiles, I

Re: Using state processor for a custom windowed aggregate function

2024-07-31 Thread Alexis Sarda-Espinosa
> order > - Triggers will only be fired after processing all events of a > respective key are processed > - Semantics are therefore slightly different as for streaming timers > > > > Hope that helps 😊 > > > > Thias > > > > > >

Using state processor for a custom windowed aggregate function

2024-07-29 Thread Alexis Sarda-Espinosa
Hello, I am trying to create state for an aggregate function that is used with a GlobalWindow. This basically looks like: savepointWriter.withOperator( OperatorIdentifier.forUid(UID), OperatorTransformation.bootstrapWith(stateToMigrate) .keyBy(...) .window(GlobalWindows.cr

Re: Troubleshooting checkpoint expiration

2024-07-23 Thread Alexis Sarda-Espinosa
stractions? [1] https://hadoop.apache.org/docs/r3.2.4/hadoop-azure/abfs.html#Perf_Options Regards, Alexis. Am Fr., 19. Juli 2024 um 09:17 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hello, > > We have a Flink job that uses ABFSS for checkpoints and related s

Troubleshooting checkpoint expiration

2024-07-19 Thread Alexis Sarda-Espinosa
Hello, We have a Flink job that uses ABFSS for checkpoints and related state. Lately we see a lot of exceptions due to expiration of checkpoints, and I'm guessing that's an issue in the infrastructure or on Azure's side, but I was wondering if there are Flink/Hadoop Java packages that log potentia

Re: Parallelism of state processor jobs

2024-07-06 Thread Alexis Sarda-Espinosa
lpful to have more detailed information to assist > with troubleshooting, including the version of Flink in use and the > JobManager logs. > > Alexis Sarda-Espinosa 于2024年7月6日周六 15:35写道: > >> Hi Junrui, >> >> Thanks for the confirmation. I tested some more and I&#

Re: Parallelism of state processor jobs

2024-07-06 Thread Alexis Sarda-Espinosa
ader to read metadata only then >>> it's non-parallel. >>> SavepointReader however is basically a normal batch job with all its >>> features. >>> >>> G >>> >>> >>> On Fri, Jul 5, 2024 at 5:21 PM Alexis Sarda-Espinosa < >>> sarda.espin...@gmail.com> wrote: >>> >>>> Hello, >>>> >>>> Really quick question, when using the state processor API, are all >>>> transformations performed in a non-parallel fashion? >>>> >>>> Regards, >>>> Alexis. >>>> >>>>

Re: Parallelism of state processor jobs

2024-07-05 Thread Alexis Sarda-Espinosa
r.g.somo...@gmail.com>: > Hi Alexis, > > It depends. When one uses SavepointLoader to read metadata only then it's > non-parallel. > SavepointReader however is basically a normal batch job with all its > features. > > G > > > On Fri, Jul 5, 2024 at 5:21 PM Alexis S

Parallelism of state processor jobs

2024-07-05 Thread Alexis Sarda-Espinosa
Hello, Really quick question, when using the state processor API, are all transformations performed in a non-parallel fashion? Regards, Alexis.

Re: Reminder: Help required to fix security vulnerabilities in Flink Docker image

2024-06-21 Thread Alexis Sarda-Espinosa
Hi Elakiya, just to be clear, I'm not a Flink maintainer, but here my 2 cents. I imagine the issues related to Go come from 'gosu', which is installed in the official Flink Docker images. You can see [1] for some thoughts from the gosu maintainer regarding CVEs (and the md file he links). Nevert

Re: Flink Kubernetes Operator 1.8.0 CRDs

2024-05-28 Thread Alexis Sarda-Espinosa
Hello, I've also noticed this in our Argo CD setup. Since priority=0 is the default, Kubernetes accepts it but doesn't store it in the actual resource, I'm guessing it's like a mutating admission hook that comes out of the box. The "priority" property can be safely removed from the CRDs. Regards,

Re: Need help in understanding PojoSerializer

2024-03-20 Thread Alexis Sarda-Espinosa
Hi Sachin, Check the last few comments I wrote in this thread: https://lists.apache.org/thread/l71d1cqo9xv8rsw0gfjo19kb1pct2xj1 Regards, Alexis. On Wed, 20 Mar 2024, 18:51 Sachin Mittal, wrote: > Hi, > I saw the post but I did not understand how I would configure these fields > to use those s

Re: Impact of RocksDB backend on the Java heap

2024-02-19 Thread Alexis Sarda-Espinosa
e performance here. > > > Best, > Zakelly > > On Sun, Feb 18, 2024 at 7:42 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hi Zakelly, >> >> thanks for the information, that's interesting. Would you say that >> reading a subset

Re: Impact of RocksDB backend on the Java heap

2024-02-18 Thread Alexis Sarda-Espinosa
gt;> persistent store. When read, data goes to the buffers and cache of RocksDB. >> >> In the case of RocksDB as state backend, JVM still holds threads stack as >> for high degree of parallelism, there are many stacks maintaining separate >> thread information. >>

Re: Impact of RocksDB backend on the Java heap

2024-02-15 Thread Alexis Sarda-Espinosa
gt; I hope this addresses your inquiry. > > > > > On Thu, Feb 15, 2024 at 12:52 AM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hello, >> >> Most info regarding RocksDB memory for Flink focuses on what's needed >> indepen

Impact of RocksDB backend on the Java heap

2024-02-15 Thread Alexis Sarda-Espinosa
Hello, Most info regarding RocksDB memory for Flink focuses on what's needed independently of the JVM (although the Flink process configures its limits and so on). I'm wondering if there are additional special considerations with regards to the JVM heap in the following scenario. Assuming a key u

Watermark alignment without idleness

2024-02-06 Thread Alexis Sarda-Espinosa
Hello, I was reading through the comments in [1] and it seems that enabling watermark alignment implicitly activates some idleness logic "if the source waits for alignment for a long time" (even if withIdleness is not called explicitly during the creation of WatermarkStrategy). Is this time someho

Re: Request to provide sample codes on Data stream using flink, Spring Boot & kafka

2024-02-06 Thread Alexis Sarda-Espinosa
Hello, check this thread from some months ago, but keep in mind that it's not really officially supported by Flink itself: https://lists.apache.org/thread/l0pgm9o2vdywffzdmbh9kh7xorhfvj40 Regards, Alexis. Am Di., 6. Feb. 2024 um 12:23 Uhr schrieb Fidea Lidea < lideafidea...@gmail.com>: > Hi Te

Re: Idleness not working if watermark alignment is used

2024-02-06 Thread Alexis Sarda-Espinosa
sing .withIdleness(…) is IMHO only justified in rare cases where > implications are fully understood. > > > > If a source is not configured with .withIdleness(…) and becomes factually > idle, all window aggregations or stateful stream joins stall until that > source becomes active

Re: Idleness not working if watermark alignment is used

2024-02-06 Thread Alexis Sarda-Espinosa
e .idleWatermarkExcemption(…) to > make it more obvious. > > > > Hope this helps > > > > > > Thias > > > > > > > > *From:* Alexis Sarda-Espinosa > *Sent:* Monday, February 5, 2024 6:04 PM > *To:* user > *Subject:* Re: Idleness not working if

Re: Idleness not working if watermark alignment is used

2024-02-05 Thread Alexis Sarda-Espinosa
Ah and I forgot to mention, this is with Flink 1.18.1 Am Mo., 5. Feb. 2024 um 18:00 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hello, > > I have 2 Kafka sources that are configured with a watermark strategy > inst

Idleness not working if watermark alignment is used

2024-02-05 Thread Alexis Sarda-Espinosa
Hello, I have 2 Kafka sources that are configured with a watermark strategy instantiated like this: WatermarkStrategy.forBoundedOutOfOrderness(maxAllowedWatermarkDrift) .withIdleness(idleTimeout) // 5 seconds currently .withWatermarkAlignment(alignmentGroup, maxAllowedWate

Watermark alignment with different allowed drifts

2024-02-05 Thread Alexis Sarda-Espinosa
Hello, is the behavior for this configuration well defined? Assigning two different (Kafka) sources to the same alignment group but configuring different max allowed drift in each one. Regards, Alexis.

Re: Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Alexis Sarda-Espinosa
Hi Dominik, Sounds like it could be this? https://issues.apache.org/jira/browse/FLINK-28060 It doesn't mention transactions but I'd guess it could be the same mechanism. Regards, Alexis. On Mon, 18 Dec 2023, 07:51 Dominik Wosiński, wrote: > Hey, > I've got a question regarding the transaction

Re: Avoid dynamic classloading in native mode with Kubernetes Operator

2023-11-20 Thread Alexis Sarda-Espinosa
ally like to just let the system do its thing > rather than build a complicated two-jar approach. > > Thanks, > Trystan > > On Fri, Nov 17, 2023 at 12:19 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hi Trystan, >> >> I imagine you can c

Re: Avoid dynamic classloading in native mode with Kubernetes Operator

2023-11-17 Thread Alexis Sarda-Espinosa
Hi Trystan, I imagine you can create 2 jars, one should only have a class with the main method, and the other should be a fat jar with everything else for your job. If you create a custom image where your fat jar is placed under /opt/flink/lib/ then I think it would "just work" when specifying the

Re: dependency error with latest Kafka connector

2023-11-14 Thread Alexis Sarda-Espinosa
Isn't it expected that it points to 1.17? That version of the Kafka connector is meant to be compatible with both Flink 1.17 and 1.18, right? So the older version should be specified so that the consumer can decide which Flink version to compile against, otherwise the build tool could silently upda

Re: Continuous errors with Azure ABFSS

2023-11-10 Thread Alexis Sarda-Espinosa
. Seems like normal operations, so it's just unfortunate the Azure API exposes that as continuous ClientOtherError metrics. Regards, Alexis. Am Fr., 6. Okt. 2023 um 08:10 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Yes, that also works correctly, at least base

Re: Disable flink old checkpoint clean

2023-11-08 Thread Alexis Sarda-Espinosa
Hello, maybe someone can correct me if I'm wrong, but reading through [1], it seems to me that manually triggered checkpoints were meant for these scenarios. If the implementation follows the ticket's description, a user-triggered checkpoint would "break the chain of incremental checkpoints", whic

Re: Updating existing state with state processor API

2023-10-27 Thread Alexis Sarda-Espinosa
> > > > PS: I’m currently working on this ticket in order to get some glitches > removed: FLINK-26585 <https://issues.apache.org/jira/browse/FLINK-26585> > > > > > > *From:* Alexis Sarda-Espinosa > *Sent:* Thursday, October 26, 2023 4:01 PM > *To:* user &

Updating existing state with state processor API

2023-10-26 Thread Alexis Sarda-Espinosa
Hello, The documentation of the state processor API has some examples to modify an existing savepoint by defining a StateBootstrapTransformation. In all cases, the entrypoint is OperatorTransformation#bootstrapWith, which expects a DataStream. If I pass an empty DataStream to bootstrapWith and the

Re: Continuous errors with Azure ABFSS

2023-10-05 Thread Alexis Sarda-Espinosa
, is it able to > safely use the checkpoint and get back to the checkpointed state? > > Regards > Ram, > > On Thu, Sep 28, 2023 at 4:46 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hi Surendra, >> >> there are no exceptions in the l

Re: Continuous errors with Azure ABFSS

2023-09-28 Thread Alexis Sarda-Espinosa
Singh Lilhore < surendralilh...@gmail.com>: > Hi Alexis, > > Could you please check the TaskManager log for any exceptions? > > Thanks > Surendra > > > On Thu, Sep 28, 2023 at 7:06 AM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> He

Re: Continuous errors with Azure ABFSS

2023-09-28 Thread Alexis Sarda-Espinosa
; Regards > Ram > > On Thu, Sep 28, 2023 at 2:06 AM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hello, >> >> We are using ABFSS for RocksDB's backend as well as the storage dir >> required for Kubernetes HA. In the Azure Portal&#x

Continuous errors with Azure ABFSS

2023-09-27 Thread Alexis Sarda-Espinosa
Hello, We are using ABFSS for RocksDB's backend as well as the storage dir required for Kubernetes HA. In the Azure Portal's monitoring insights I see that every single operation contains failing transactions for the GetPathStatus API. Unfortunately I don't see any additional details, but I know t

Re: Side outputs documentation

2023-09-26 Thread Alexis Sarda-Espinosa
ould be independent from each > other and OutputTag's document is correct from this aspect. > > [1] > https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/util/OutputTag.java#L82 > > Best, > Yunfeng > > On Mon, Sep 25, 2023 at 10:57 PM Al

Re: Side outputs documentation

2023-09-25 Thread Alexis Sarda-Espinosa
's current behavior. A ticket and PR could be > added to fix the document. What do you think? > > Best, > Yunfeng > > On Fri, Sep 22, 2023 at 4:55 PM Alexis Sarda-Espinosa > wrote: > > > > Hello, > > > > very quick question, the documentation for side

Side outputs documentation

2023-09-22 Thread Alexis Sarda-Espinosa
Hello, very quick question, the documentation for side outputs states that an OutputTag "needs to be an anonymous inner class, so that we can analyze the type" (this is written in a comment in the example). Is this really true? I've seen many examples where it's a static element and it seems to wo

Re: Failure to restore from last completed checkpoint

2023-09-08 Thread Alexis Sarda-Espinosa
Hello, Just a shot in the dark here, but could it be related to https://issues.apache.org/jira/browse/FLINK-32241 ? Such failures can cause many exceptions, but I think the ones you've included aren't pointing to the root cause, so I'm not sure if that issue applies to you. Regards, Alexis. On

Semantics of purging with global windows

2023-08-30 Thread Alexis Sarda-Espinosa
Hello, According to the javadoc of TriggerResult.PURGE, "All elements in the window are cleared and the window is discarded, without evaluating the window function or emitting any elements." However, I've noticed that using a GlobalWindow (with a custom trigger) followed by an AggregateFunction wi

Re: Question about serialization of java.util classes

2023-08-15 Thread Alexis Sarda-Espinosa
this but unfortunately I have no > idea how to use these classes or how they might be able to help me. This is > all very new to me and I honestly can't wrap my head around Flink's type > information system. > > Best regards, > Saleh. > > On 14 Aug 2023, at 4:0

Re: Question about serialization of java.util classes

2023-08-14 Thread Alexis Sarda-Espinosa
Hello, AFAIK you cannot avoid TypeInformationFactory due to type erasure, nothing Flink can do about that. Here's an example of helper classes I've been using to support set serde in Flink POJOs, but note that it's hardcoded for LinkedHashSet, so you would have to create different implementations

Task manager creation in Flink native Kubernetes (application mode)

2023-07-25 Thread Alexis Sarda-Espinosa
Hi everyone, >From its inception (at least AFAIK), application mode for native Kubernetes has always created "unmanaged" pods for task managers. I would like to know if there are any specific benefits to this, or if on the other hand there are specific reasons not to use Kubernetes Deployments ins

Re: Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
I found out someone else reported this and found a workaround: https://issues.apache.org/jira/browse/FLINK-32241 Am Mo., 10. Juli 2023 um 16:45 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hi again, > > I have found out that this issue occurred in 3 different

Re: Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
, Alexis. Am Mo., 10. Juli 2023 um 11:07 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hello, > > we have just experienced a weird issue in one of our Flink clusters which > might be difficult to reproduce, but I figured I would document it in case > some of you

Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
Hello, we have just experienced a weird issue in one of our Flink clusters which might be difficult to reproduce, but I figured I would document it in case some of you know what could have gone wrong. This cluster had been running with Flink 1.16.1 for a long time and was recently updated to 1.17.

Re: Kafka source with idleness and alignment stops consuming

2023-06-29 Thread Alexis Sarda-Espinosa
Do., 29. Juni 2023 um 10:08 Uhr schrieb Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hi Martjin, > > thanks for the pointers. I think the issue I'm seeing is not caused by > those because in my case the watermarks are not negative. Some more > information from my s

Re: Kafka source with idleness and alignment stops consuming

2023-06-29 Thread Alexis Sarda-Espinosa
owse/FLINK-32414 and > https://issues.apache.org/jira/browse/FLINK-32420 - Could the later be > also applicable in your case? > > Best regards, > > Martijn > > On Wed, Jun 28, 2023 at 11:33 AM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hel

Re: Kafka source with idleness and alignment stops consuming

2023-06-28 Thread Alexis Sarda-Espinosa
eally a bug. > `shouldWaitForAlignment` needs to be another change. > > By the way, a source will be marked as idle, when the source has waiting > for alignment for a long time. Is this a bug? > > > > > > > 在 2023-06-27 23:25:38,"Alexis Sarda-Espinosa" > 写道: > &

Kafka source with idleness and alignment stops consuming

2023-06-27 Thread Alexis Sarda-Espinosa
Hello, I am currently evaluating idleness and alignment with Flink 1.17.1 and the externalized Kafka connector. My job has 3 sources whose watermark strategies are defined like this: WatermarkStrategy.forBoundedOutOfOrderness(maxAllowedWatermarkDrift) .withIdleness(idleTimeout)

Re: Interaction between idling sources and watermark alignment

2023-06-16 Thread Alexis Sarda-Espinosa
ava/org/apache/flink/streaming/api/operators/SourceOperator.java#L659 > [4] > https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/eventtime/WatermarksWithWatermarkAlignment.java#L29 > > > > On 13 Jun 2023, at 21:08, Alexis Sarda-Espinosa &

Re: Interaction between idling sources and watermark alignment

2023-06-13 Thread Alexis Sarda-Espinosa
Hi again, I'm not a fan of bumping questions, but I think this might be relevant, maybe enough to include it in the official documentation? Regards, Alexis. On Tue, 30 May 2023, 16:07 Alexis Sarda-Espinosa, wrote: > Hello, > > I see that, in Flink 1.17.1, watermark alignment wil

Re: RocksDB segfault on state restore

2023-06-01 Thread Alexis Sarda-Espinosa
Hello, A couple of potentially relevant pieces of information: 1. https://issues.apache.org/jira/browse/FLINK-16686 2. https://stackoverflow.com/a/64721838/5793905 (question was about schema evolution, but the answer is more generally applicable) Regards, Alexis. Am Fr., 2. Juni 2023 um 07:18 U

Interaction between idling sources and watermark alignment

2023-05-30 Thread Alexis Sarda-Espinosa
Hello, I see that, in Flink 1.17.1, watermark alignment will be supported (as beta) within a single source's splits and across different sources. I don't see this explicitly mentioned in the documentation, but I assume that the concept of "maximal drift" used for alignment also takes idleness into

Kubernetes operator stops responding due to Connection reset by peer

2023-04-21 Thread Alexis Sarda-Espinosa
Hello, Today, we received an alert because the operator appeared to be down. Upon further investigation, we realized the alert was triggered because the endpoint for Prometheus metrics (which we enabled) stopped responding, so it seems the endpoint used for the liveness probe wasn't affected and t

Requirements for POJO serialization

2023-04-11 Thread Alexis Sarda-Espinosa
Hello, according to the documentation, a POJO must have a no-arg constructor and either public fields or public getters and setters with conventional naming. I recently realized that if I create an explicit TypeInfoFactory that provides Types.POJO and all other required details, the getters and se

Re: [Discussion] - Release major Flink version to support JDK 17 (LTS)

2023-03-30 Thread Alexis Sarda-Espinosa
Hi Martijn, just to be sure, if all state-related classes use a POJO serializer, Kryo will never come into play, right? Given FLINK-16686 [1], I wonder how many users actually have jobs with Kryo and RocksDB, but even if there aren't many, that still leaves those who don't use RocksDB for checkpoi

Re: Watermarks lagging behind events that generate them

2023-03-15 Thread Alexis Sarda-Espinosa
rators such as BoundedOutOfOrderTimestamps, > the timestamp of watermark will be reset to Long.MIN_VALUE if the subtask > is restarted and no event from source is processed. > > Best, > Shammon FY > > On Tue, Mar 14, 2023 at 4:58 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrot

Re: Watermarks lagging behind events that generate them

2023-03-14 Thread Alexis Sarda-Espinosa
e: > > > > Hi Alexis > > > > Do you use both event-time watermark generator and TimerService for > processing time in your job? Maybe you can try using event-time watermark > first. > > > > Best, > > Shammon.FY > > > > On Sat, Mar

Watermarks lagging behind events that generate them

2023-03-10 Thread Alexis Sarda-Espinosa
Hello, I recently ran into a weird issue with a streaming job in Flink 1.16.1. One of my functions (KeyedProcessFunction) has been using processing time timers. I now want to execute the same job based on a historical data dump, so I had to adjust the logic to use event time timers in that case (a

Re: [EXTERNAL] Re: Secure Azure Credential Configuration

2023-03-06 Thread Alexis Sarda-Espinosa
t;> >> >> >> Ivan >> >> >> >> >> >> [1] - >> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azure.AzureException: >> No credentials found for account myblob.blob.core.windows.net in the >> configuration, and its contai

Re: [EXTERNAL] Re: Secure Azure Credential Configuration

2023-03-02 Thread Alexis Sarda-Espinosa
but I considered trying it as > an experiment at one point. > > > > *From: *Alexis Sarda-Espinosa > *Sent: *Thursday, March 2, 2023 2:38 PM > *To: *Ivan Webber > *Cc: *user > *Subject: *[EXTERNAL] Re: Secure Azure Credential Configuration > > > > You d

Re: Secure Azure Credential Configuration

2023-03-02 Thread Alexis Sarda-Espinosa
Hi Ivan, Mercy is always free. Are you using WASB or ABFS? I presume it's the latter, since that's the one that can't use EnvironmentVariableKeyProvider, but just to be sure. Regards, Alexis. On Thu, 2 Mar 2023, 23:07 Ivan Webber via user, wrote: > TLDR: I will buy your coffee if you can help

Re: Fast and slow stream sources for Interval Join

2023-02-27 Thread Alexis Sarda-Espinosa
various topics can align on the partitions of the different > topics. > > [1] > https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment-_beta_ > > Best, > Mason > > On Mon, Feb 27, 2023 at 8:11 

Re: Fast and slow stream sources for Interval Join

2023-02-27 Thread Alexis Sarda-Espinosa
Hello, I had this question myself and I've seen it a few times, the answer is always the same, there's currently no official way to handle it without state. Regards, Alexis. On Mon, 27 Feb 2023, 14:09 Remigiusz Janeczek, wrote: > Hi, > > How to handle a case where one of the Kafka topics used

Re: Kubernetes operator's merging strategy for template arrays

2023-02-23 Thread Alexis Sarda-Espinosa
a pending improvement to make this configurable and allow >> merging arrays by "name" attribute. This is generally more practical for >> such cases. >> >> Cheers, >> Gyula >> >> On Thu, Feb 23, 2023 at 1:37 PM Alexis Sarda-Espinosa < >&

Kubernetes operator's merging strategy for template arrays

2023-02-23 Thread Alexis Sarda-Espinosa
Hello, I noticed that if I set environment variables in both spec.podTemplate & spec.jobManager.podTemplate for the same container (flink-maincontainer), the values from the latter selectively overwrite the values from the former. For example, if I define something like this (omitting metadata pro

Re: Calculation of UI's maximum non-heap memory

2023-02-21 Thread Alexis Sarda-Espinosa
; So, the maximum non-heap is 150+142+240 = 532m. > > > Best, > Weihua > > > On Tue, Feb 21, 2023 at 2:33 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hi Weihua, >> >> Thanks for your response, I am familiar with those calculations,

Re: Calculation of UI's maximum non-heap memory

2023-02-20 Thread Alexis Sarda-Espinosa
52980629/runtime-getruntime-maxmemory-calculate-method > > Best, > Weihua > > > On Tue, Feb 21, 2023 at 12:15 AM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hello, >> >> I have configured a job manager with the following settin

Calculation of UI's maximum non-heap memory

2023-02-20 Thread Alexis Sarda-Espinosa
Hello, I have configured a job manager with the following settings (Flink 1.16.1): jobmanager.memory.process.size: 1024m jobmanager.memory.jvm-metaspace.size: 150m jobmanager.memory.off-heap.size: 64m jobmanager.memory.jvm-overhead.min: 168m jobmanager.memory.jvm-overhead.max: 168m jobmanager.mem

Re: Could savepoints contain in-flight data?

2023-02-13 Thread Alexis Sarda-Espinosa
ink-docs-release-1.16/docs/ops/state/checkpointing_under_backpressure/ >> [2] >> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/concepts/stateful-stream-processing/#checkpointing >> >> Alexis Sarda-Espinosa 于2023年2月11日周六 06:00写道: >> >>> Hello,

Could savepoints contain in-flight data?

2023-02-10 Thread Alexis Sarda-Espinosa
Hello, One feature of unaligned checkpoints is that the checkpoint barriers can overtake in-flight data, so the buffers are persisted as part of the state. The documentation for savepoints doesn't mention anything explicitly, so just to be sure, will savepoints always wait for in-flight data to b

Re: Backpressure due to busy sub-tasks

2022-12-16 Thread Alexis Sarda-Espinosa
t; martijnvis...@apache.org>: > Hi, > > Backpressure implies that it's actually a later operator that is busy. So > in this case, that would be your process function that can't handle the > incoming load from your Kafka source. > > Best regards, > > Martijn >

Backpressure due to busy sub-tasks

2022-12-13 Thread Alexis Sarda-Espinosa
Hello, I have a Kafka source (the new one) in Flink 1.15 that's followed by a process function with parallelism=2. Some days, I see long periods of backpressure in the source. During those times, the pool-usage metrics of all tasks stay between 0 and 1%, but the process function appears 100% busy.

Re: Clarification on checkpoint cleanup with RETAIN_ON_CANCELLATION

2022-12-12 Thread Alexis Sarda-Espinosa
deletion? > In this case, the checkpoint will be cleaned and not retained and the > savepoint will remain. So you still could use savepoint to restore. > > On Mon, Dec 5, 2022 at 6:33 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hello,

Re: Deterministic co-results

2022-12-08 Thread Alexis Sarda-Espinosa
Hi Salva, Just to give you further thoughts from another user, I think the "temporal join" semantics are very critical in this use case, and what you implement for that may not easily generalize to other cases. Because of that, I'm not sure if you can really define best practices that apply in gen

Re: Cleanup for high-availability.storageDir

2022-12-08 Thread Alexis Sarda-Espinosa
;s probably not what you want. > > @Gyula: Please correct me if I misunderstood something here. > > I hope that helped. > Matthias > > On Wed, Dec 7, 2022 at 4:19 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> I see, thanks for the details.

Re: Cleanup for high-availability.storageDir

2022-12-07 Thread Alexis Sarda-Espinosa
pdates *without* savepoints"? Are you > referring to replacing the job's business logic without stopping the job? > > On Wed, Dec 7, 2022 at 3:17 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hi Matthias, >> >> Then the explanation

Re: Cleanup for high-availability.storageDir

2022-12-07 Thread Alexis Sarda-Espinosa
- job_name/submittedJobGraphX >> - job_name/submittedJobGraphY >> >> Is it safe to clean these up when the job is in a healthy state? >> >> Regards, >> Alexis. >> >> Am Mo., 5. Dez. 2022 um 20:09 Uhr schrieb Alexis Sarda-Espinosa < >> sarda.es

Re: Cleanup for high-availability.storageDir

2022-12-06 Thread Alexis Sarda-Espinosa
Alexis Sarda-Espinosa < sarda.espin...@gmail.com>: > Hi Gyula, > > that certainly helps, but to set up automatic cleanup (in my case, of > azure blob storage), the ideal option would be to be able to set a simple > policy that deletes blobs that haven't been updated in s

Re: Cleanup for high-availability.storageDir

2022-12-05 Thread Alexis Sarda-Espinosa
in the HA dir that > need to be cleaned up by the user: > > > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/concepts/overview/#jobresultstore-resource-leak > > > Hope this helps > Gyula > > On Mon, 5 Dec 2022 at 11:56, Alexis Sarda-Espinosa &

Cleanup for high-availability.storageDir

2022-12-05 Thread Alexis Sarda-Espinosa
Hello, I see the number of entries in the directory configured for HA increases over time, particularly in the context of job upgrades in a Kubernetes environment managed by the operator. Would it be safe to assume that any files that haven't been updated in a while can be deleted? Assuming the ch

Clarification on checkpoint cleanup with RETAIN_ON_CANCELLATION

2022-12-05 Thread Alexis Sarda-Espinosa
Hello, I have a doubt about a very particular scenario with this configuration: - Flink HA enabled (Kubernetes). - ExternalizedCheckpointCleanup set to RETAIN_ON_CANCELLATION. - Savepoint restore mode left as default NO_CLAIM. During an upgrade, a stop-job-with-savepoint is triggered, and then t

Re: Savepoint restore mode for the Kubernetes operator

2022-11-29 Thread Alexis Sarda-Espinosa
oint cleanup should not clean the savepoint from >>> the old job which should only be controlled by restore mode. >>> So I think you could also set restore mode according to your needs. >>> >>> >>> On Wed, Nov 16, 2022 at 10:41 PM Alexis Sarda-Espinosa &

Re: Savepoint restore mode for the Kubernetes operator

2022-11-29 Thread Alexis Sarda-Espinosa
> job. > In other words, savepoint cleanup should not clean the savepoint from the > old job which should only be controlled by restore mode. > So I think you could also set restore mode according to your needs. > > > On Wed, Nov 16, 2022 at 10:41 PM Alexis Sarda-Espinosa &l

Kubernetes operator and jobs with last-state upgrades

2022-11-16 Thread Alexis Sarda-Espinosa
Hello, I am doing some tests with the operator and, if I'm not mistaken, using last-state upgrade means that, when something is changed in the CR, no savepoint is taken and the pods are simply terminated. Is that a requirement from Flink HA? I would have thought last-state would still use savepoin

Re: Owner reference with the Kubernetes operator

2022-11-16 Thread Alexis Sarda-Espinosa
at 3:32 PM Alexis Sarda-Espinosa < > sarda.espin...@gmail.com> wrote: > >> Hello, >> >> Is there a particular reason the operator doesn't set owner references >> for the Deployments it creates as a result of a FlinkDeployment CR? This >> makes tracking

Savepoint restore mode for the Kubernetes operator

2022-11-16 Thread Alexis Sarda-Espinosa
Hello, Is there a recommended configuration for the restore mode of jobs managed by the operator? Since the documentation states that the operator keeps a savepoint history to perform cleanup, I imagine restore mode should always be NO_CLAIM, but I just want to confirm. Regards, Alexis.

Owner reference with the Kubernetes operator

2022-11-16 Thread Alexis Sarda-Espinosa
Hello, Is there a particular reason the operator doesn't set owner references for the Deployments it creates as a result of a FlinkDeployment CR? This makes tracking in the Argo CD UI impossible. (To be clear, I mean a reference from the Deployment to the FlinkDeployment). Regards, Alexis.

Broadcast state and job restarts

2022-10-27 Thread Alexis Sarda-Espinosa
Hello, The documentation for broadcast state specifies that it is always kept in memory. My assumptions based on this statement are: 1. If a job restarts in the same Flink cluster (i.e. using a restart strategy), the tasks' attempt number increases and the broadcast state is restored since it's n

Broadcast state restoration for BroadcastProcessFunction

2022-10-14 Thread Alexis Sarda-Espinosa
Hello, I wrote a test for a broadcast function to check how it handles broadcast state during retries [1] (the gist only shows a subset of the test in Kotlin, but it's hopefully understandable). The test will not pass unless my function also implements CheckpointedFunction, although those interfac

Re: Partial broadcast/keyed connected streams

2022-10-11 Thread Alexis Sarda-Espinosa
tate as normal keyedstream. > > > > Best Regards! > > 从 Windows 版邮件 <https://go.microsoft.com/fwlink/?LinkId=550986>发送 > > > > *发件人: *Alexis Sarda-Espinosa > *发送时间: *2022年10月12日 4:11 > *收件人: *user > *主题: *Partial broadcast/keyed connected streams > > >

  1   2   >