[ANNOUNCE] Flink Jira Bot fully live (& Useful Filters to Work with the Bot)

2021-04-22 Thread Konstantin Knauf
Hi everyone, all of the Jira Bot rules are live now. Particularly in the beginning the Jira Bot will be very active, because the rules apply to a lot of old, stale tickets. So, if you get a huge amount of emails from the Flink Jira Bot right now, this will get better. In any case, the Flink Jira B

Re: Flink streaming job's taskmanager process killed by yarn nodemanager because of exceeding 'PHYSICAL' memory limit

2021-04-22 Thread Matthias Pohl
Hi, I have a few questions about your case: * What is the option you're referring to for the bounded shuffle? That might help to understand what streaming mode solution you're looking for. * What does the job graph look like? Are you assuming that it's due to a shuffling operation? Could you provid

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
I saw https://stackoverflow.com/questions/57334257/the-end-timestamp-of-an-event-time-window-cannot-become-earlier-than-the-current. and this seems to suggest a straight up filter, but I am not sure how does that filter works as in would it factor is the lateness when filtering ? On Thu, Apr 22, 2

Re: Flink Hadoop config on docker-compose

2021-04-22 Thread Flavio Pompermaier
Great! Thanks for the support On Thu, Apr 22, 2021 at 2:57 PM Matthias Pohl wrote: > I think you're right, Flavio. I created FLINK-22414 to cover this. Thanks > for bringing it up. > > Matthias > > [1] https://issues.apache.org/jira/browse/FLINK-22414 > > On Fri, Apr 16, 2021 at 9:32 AM Flavio P

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
Well it was not a solution after all. We now have a session window that is stuck with the same issue albeit after the additional lateness. I had increased the lateness to 2 days and that masked the issue which again reared it's head after the 2 days ;lateness was over ( instead of the 1 day ) befo

Re: Flink streaming job's taskmanager process killed by yarn nodemanager because of exceeding 'PHYSICAL' memory limit

2021-04-22 Thread dhanesh arole
Hi, Questions that @matth...@ververica.com asked are very valid and might provide more leads. But if you haven't already then it's worth trying to use jemalloc / tcmalloc. We had similar problems with slow growth in TM memory resulting in pods getting OOMed by k8s. After switching to jemalloc, th

Re: Flink Statefun Python Batch

2021-04-22 Thread Igal Shilman
Hi Tim, I've created a tiny PoC, let me know if this helps, I can't guarantee tho, that this is how we'll eventually approach this, but it should be somewhere along these lines. https://github.com/igalshilman/flink-statefun/tree/tim Thanks, Igal. On Thu, Apr 22, 2021 at 6:53 AM Timothy Bess w

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Matthias Pohl
Hi Vishal, based on the error message and the behavior you described, introducing a filter for late events is the way to go - just as described in the SO thread you mentioned. Usually, you would collect late events in some kind of side output [1]. I hope that helps. Matthias [1] https://ci.apache

Re: Flink Hadoop config on docker-compose

2021-04-22 Thread Matthias Pohl
I think you're right, Flavio. I created FLINK-22414 to cover this. Thanks for bringing it up. Matthias [1] https://issues.apache.org/jira/browse/FLINK-22414 On Fri, Apr 16, 2021 at 9:32 AM Flavio Pompermaier wrote: > Hi Yang, > isn't this something to fix? If I look at the documentation at [1

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
I can do that, but I am not certain this is the right filter. Can you please validate. That aside I already have the lateness configured for the session window ( the normal withLateNess() ) and this looks like a session window was not collected and still is alive for some reason ( a flink bug ? )

Re: MemoryStateBackend Issue

2021-04-22 Thread Matthias Pohl
Hi Milind, I bet someone else might have a faster answer. But could you provide the logs and config to get a better understanding of what your issue is? In general, the state is maintained even in cases where a TaskManager fails. Best, Matthias On Thu, Apr 22, 2021 at 5:11 AM Milind Vaidya wrote

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
The only thing I can think of is to add the lateness configured to the filter as in here, as in the time on the element + lateness should always be greater then the current WM. As in the current issue is Mon Apr 19 20:46:20 EDT 2021. Window end Wed Apr 21 21:09:02 EDT 2021, WM an event forc

Re: Long to Timestamp(3) Conversion

2021-04-22 Thread Matthias Pohl
Hi Aeden, there are some improvements to time conversions coming up in Flink 1.13. For now, the best solution to overcome this is to provide a user-defined function. Hope, that helps. Best, Matthias On Wed, Apr 21, 2021 at 9:36 PM Aeden Jameson wrote: > I've probably overlooked something simple

Re: Kubernetes Setup - JM as job vs JM as deployment

2021-04-22 Thread Matthias Pohl
Hi Gil, I'm not sure whether I understand you correctly. What do you mean by deploying the job manager as "job" or "deployment"? Are you referring to the different deployment modes, Flink offers [1]? These would be independent of Kubernetes. Or do you wonder what the differences are between the Fli

Question about snapshot file

2021-04-22 Thread Abdullah bin Omar
Hi, (1) what 's the snapshot metadata file (binary) contains ? is it possible to read the snapshot metadata file by using Flink Deserialization? (2) is there any function that can be used to see the previous states on time of operation? Thank you

Re: Debezium CDC | OOM

2021-04-22 Thread Matthias Pohl
Hi Ayush, Which state backend have you configured [1]? Have you considered trying out RocksDB [2]? RocksDB might help with persisting at least keyed state. Best, Matthias [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#choose-the-right-state-backend [2] ht

Re: Question about snapshot file

2021-04-22 Thread Matthias Pohl
Hi Abdullah, the metadata file contains handles to the operator states of the checkpoint [1]. You might want to have a look into the State Processor API [2]. Best, Matthias [1] https://github.com/apache/flink/blob/adaaed426c2e637b8e5ffa3f0d051326038d30aa/flink-runtime/src/main/java/org/apache/fli

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
As in this is essentially doing what lateness *should* have done And I think that is a bug. My code now is . Please look at the allowedLateness on the session window. SingleOutputStreamOperator> filteredKeyedValue = keyedValue .process(new LateEventFilter(this.lateNessInMinutes*60*1000l)).name( "

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Matthias Pohl
You're saying that you used `allowedLateness`/`sideOutputLateData` as described in [1] but without the `LateEventFilter`/`LateEventSideOutput` being added to your pipeline when running into the UnsupportedOperationException issue previously? [1] https://ci.apache.org/projects/flink/flink-docs-rele

Re: [ANNOUNCE] Flink Jira Bot fully live (& Useful Filters to Work with the Bot)

2021-04-22 Thread Matthias Pohl
Thanks for setting this up, Konstantin. +1 On Thu, Apr 22, 2021 at 11:16 AM Konstantin Knauf wrote: > Hi everyone, > > all of the Jira Bot rules are live now. Particularly in the beginning the > Jira Bot will be very active, because the rules apply to a lot of old, > stale tickets. So, if you ge

Multiple jobs in the same Flink project

2021-04-22 Thread Oğuzhan Mangır
We have a flink project with multiple jobs. That means we can submit multiple job with the same jar. But there is a limitation here i think. Because, let's assume; I create a flink project with 3 jobs, and create a single jar then put it to the flink cluster (all of these steps are working on a ci

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
Yes sir. The allowedLateNess and side output always existed. On Thu, Apr 22, 2021 at 11:47 AM Matthias Pohl wrote: > You're saying that you used `allowedLateness`/`sideOutputLateData` as > described in [1] but without the `LateEventFilter`/`LateEventSideOutput` > being added to your pipeline whe

Re: Question about snapshot file

2021-04-22 Thread Abdullah bin Omar
Hi, I have a savepoint or checkpointed file from my task. However, the file is binary. I want to see what the file contains. How is it possible to see what information the file has (or how it is possible to make it human readable?) Thank you On Thu, Apr 22, 2021 at 10:19 AM Matthias Pohl wrote

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
And when I added the filter the Exception was not thrown. So the sequence of events * Increased lateness from 12 ( that was what it was initially running with ) to 24 hours * the pipe ran as desired before it blew up with the Exception * masked the issue by increasing the lateness to 48 hours. *

Re: Multiple jobs in the same Flink project

2021-04-22 Thread Arvid Heise
Hi Oğuzhan, I think you know the answer already: it's easiest to have 1 jar per application. And in most cases, it's easiest to also have 1 repo per application. You can use the same template for all 3 and all future applications without any special cases. My rule of thumb is the following: if th

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Vishal Santoshi
<< Added the Fliter upfront as below, the pipe has no issues. Also metrics show that no data is being pushed through the sideoutput and that data in not pulled from the a simulated sideout ( below ) >> Added the Fliter upfront as below, the pipe has no issues. Also metrics show that no data is

Official flink java client

2021-04-22 Thread gaurav kulkarni
Hi,  Is there any official flink client in java that's available? I came across RestClusterClient, but I am not sure if its official. I can create my own client, but just wanted to check if there is anything official available already that I can leverage.  Thanks,Gaurav | | | | | | | |

Re: [ANNOUNCE] Flink Jira Bot fully live (& Useful Filters to Work with the Bot)

2021-04-22 Thread Xintong Song
Thanks for driving this, Konstantin. Great job~! Thank you~ Xintong Song On Thu, Apr 22, 2021 at 11:57 PM Matthias Pohl wrote: > Thanks for setting this up, Konstantin. +1 > > On Thu, Apr 22, 2021 at 11:16 AM Konstantin Knauf > wrote: > >> Hi everyone, >> >> all of the Jira Bot rules are li

Re: Flink streaming job's taskmanager process killed by yarn nodemanager because of exceeding 'PHYSICAL' memory limit

2021-04-22 Thread 马阳阳
Hi Matthias, We have “solved” the problem by tuning the join. But I still try to answer the questions, hoping this will help. * What is the option you're referring to for the bounded shuffle? That might help to understand what streaming mode solution you're looking for. | taskmanager.netwo

Re: Official flink java client

2021-04-22 Thread Yun Gao
Hi gaurav, Logicall Flink client is bear inside the StreamExecutionEnvironment, and users could use the StreamExecutionEnvironment to execute their jobs. Could you share more about why you want to directly use the client? Best, Yun --Original Mail -- Sende

Re: Debezium CDC | OOM

2021-04-22 Thread Ayush Chauhan
Hi Matthias, I am using RocksDB as a state backend. I think the iceberg sink is not able to propagate back pressure to the source which is resulting in OOM for my CDC pipeline. Please refer to this - https://github.com/apache/iceberg/issues/2504 On Thu, Apr 22, 2021 at 8:44 PM Matthias Pohl wr

Re: Question about snapshot file

2021-04-22 Thread Matthias Pohl
What is it you're trying to achieve in general? The JavaDoc of MetadataV2V3SerializerBase provides a description on the format of the file. Theoretically, you could come up with custom code using the Flink sources to parse the content of the file. But maybe, there's another way to accomplish what y

Re: event-time window cannot become earlier than the current watermark by merging

2021-04-22 Thread Matthias Pohl
To me, it sounds strange. I would have expected it to work with `allowedLateness` and `sideOutput` being defined. I pull in David to have a look at it. Maybe, he has some more insights. I haven't worked that much with lateness, yet. Matthias On Thu, Apr 22, 2021 at 10:57 PM Vishal Santoshi wrote

Re: Flink streaming job's taskmanager process killed by yarn nodemanager because of exceeding 'PHYSICAL' memory limit

2021-04-22 Thread Matthias Pohl
Thanks for sharing these details. Looking into FLINK-14952 [1] (which introduced this option) and the related mailing list thread [2], it feels like your issue is quite similar to what is described in there even though it sounds like this issue is mostly tied to bounded jobs. But I'm not sure what

Re: Flink streaming job's taskmanager process killed by yarn nodemanager because of exceeding 'PHYSICAL' memory limit

2021-04-22 Thread Matthias Pohl
Another few questions: Have you had the chance to monitor/profile the memory usage? What section of the memory was used excessively? Additionally, could @dhanesh arole 's proposal solve your issue? Matthias On Fri, Apr 23, 2021 at 8:41 AM Matthias Pohl wrote: > Thanks for sharing these details.