Re: Is there a way to list the operators in a savepoints?

2025-04-11 Thread Gabor Somogyi
t state for operator b7034a3a871df55ba4407d4f859fa00b to > the new program, because the operator is not available in the new program. > > I understand that means there is an operator in our savepoint that is not > in the application, but I cannot think of any new one that would > have

Is there a way to list the operators in a savepoints?

2025-04-11 Thread Jean-Marc Paulin
e new program. I understand that means there is an operator in our savepoint that is not in the application, but I cannot think of any new one that would have changed. Is there an API to list all the operators that are in the savepoints? Maybe that would help me to identify the culprit. Thanks JM

Re: Does Flink serialize events between all operators?

2025-03-11 Thread Alexey Novakov via user
Hi Vadim, Yes, it does serialize objects between operators even if they run within the same Task Manager unless object-reuse configuration is on: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#pipeline-object-reuse Using immutable data (which is one of the principal

Does Flink serialize events between all operators?

2025-03-11 Thread Vararu, Vadim via user
Hello, Does Flink serialize all the data when moving from one operator to another (even when there is no shuffling/hashing between them)? If yes, then, does it worth to have less operators doing more stuff instead of more granular operators? For instance, one flat map + one filter could be

No operators defined in streaming topology. Cannot execute.

2024-11-18 Thread Nadia Mujeeb
So my task is to set up a Apache flink in my Linux Ubuntu wherein I should be having two data bases as postgress and MySQL . The 2 data bases should be connected in such a way that any change or update in my postgres database should I immediately reflect in my SQL database. But this is error I'm e

Re: No operators defined in streaming topology. Cannot execute.

2024-11-18 Thread Feng Jin
Hi Nadia You don't need to call `Env.execute("Flink CDC Job");` because the job will be submitted after executing `Result.executeInsert("mysql_table");`. Best, Feng On Mon, Nov 18, 2024 at 12:05 PM Nadia Mujeeb wrote: > > So my task is to set up a Apache flink in my Linux Ubuntu wherein I sh

Re: Re:Backpressure causing operators to stop ingestion completely

2024-10-23 Thread Raihan Sunny via user
Re:Backpressure causing operators to stop ingestion completely Hi, you can use flame_graphs to debug https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/debugging/flame_graphs/ -- Original -- From: "Raihan Sunny" ; Date: Tue, Oct 22

Re:Backpressure causing operators to stop ingestion completely

2024-10-22 Thread Jake.zhang
Hi,  you can use flame_graphs to debug https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/ops/debugging/flame_graphs/ -- Original -- From:

Backpressure causing operators to stop ingestion completely

2024-10-21 Thread Raihan Sunny via user
Hello, I have an aggregator job that experiences backpressure after running for a while and completely stops processing. It doesn't take any further input from the source. Here's a bit of context: - There are 3 producer jobs, all of which write data to a common Kafka topic - The aggregator job r

RE: Can we share states across tasks/operators

2024-08-16 Thread Schwalbe Matthias
: TimestampedCollector[O] = null @transient var internalTimerService: InternalTimerService[VoidNamespace] = null override def open(): Unit = { super.open() require(configuration != null, "Missing configuration") //setup collector and timer services (modelled after other

AW: Can we share states across tasks/operators

2024-08-09 Thread Christian Lorenz via user
regards, Christian Von: Schwalbe Matthias Datum: Mittwoch, 7. August 2024 um 12:54 An: Sachin Mittal , user@flink.apache.org Betreff: RE: Can we share states across tasks/operators This email has reached Mapp via an external source Hi Sachin, Just as an idea, while you cannot easily share state

Re: Can we share states across tasks/operators

2024-08-07 Thread Sachin Mittal
, 2024 at 4:23 PM Schwalbe Matthias < matthias.schwa...@viseca.ch> wrote: > Hi Sachin, > > > > Just as an idea, while you cannot easily share state across operators, you > can do so within the same operator: > >- For two such input streams you could connect

RE: Can we share states across tasks/operators

2024-08-07 Thread Schwalbe Matthias
Hi Sachin, Just as an idea, while you cannot easily share state across operators, you can do so within the same operator: * For two such input streams you could connect() the two streams into a ConnectedStreams and then process() by means of a KeyedCoProcessFunction * For more than two

Can we share states across tasks/operators

2024-08-07 Thread Sachin Mittal
Hi, I have a stream which starts from a source and is keyed by a field f. With the stream process function, I can emit the processed record downstream and also update state based on the records it received for the same key. Now I have another stream which starts from another source and is of the s

Re: How to list operators and see UID

2024-04-03 Thread Asimansu Bera
"read-bytes-complete": true, "write-bytes": 0, "write-bytes-complete": true, "read-records": 7207, "read-records-complete": true, "write-records": 0, "write-records-comple

How to list operators and see UID

2024-04-03 Thread Oscar Perez via user
Hei, We are facing an issue with one of the jobs in production where fails to map state from one deployment to another. I guess the problem is that we failed to set a UID and relies on the default of providing one based on hash Is it possible to see all operators / UIDs at a glance? What is the

Re: Question about time-based operators with RocksDB backend

2024-03-06 Thread xia rui
nalTimerService internalTimerService" refers to the timer state. Best regards Rui Xia On Mon, Mar 4, 2024 at 7:39 PM Gabriele Mencagli < gabriele.menca...@gmail.com> wrote: > Dear Flink Community, > > I am using Flink with the DataStream API and operators implemented using > RichedFunction

Re: Question about time-based operators with RocksDB backend

2024-03-05 Thread Jinzhong Li
Hi Gabriele, The keyed state APIs (ValueState、ListState、etc) are supported by all types of state backend (hashmap、rocksdb、etc.). And the built-in window operators are implemented with these state APIs internally. So you can use these built-in operators/functions with the RocksDB state backend

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zakelly Lan
Hi Gabriele, Quick answer: You can use the built-in window operators which have been integrated with state backends including RocksDB. Thanks, Zakelly On Tue, Mar 5, 2024 at 10:33 AM Zhanghao Chen wrote: > Hi Gabriele, > > I'd recommend extending the existing window fun

Re: Question about time-based operators with RocksDB backend

2024-03-04 Thread Zhanghao Chen
an be satisfied with the reduce/aggregate function pattern, which is important for large windows. Best, Zhanghao Chen From: Gabriele Mencagli Sent: Monday, March 4, 2024 19:38 To: user@flink.apache.org Subject: Question about time-based operators with RocksDB ba

Question about time-based operators with RocksDB backend

2024-03-04 Thread Gabriele Mencagli
Dear Flink Community, I am using Flink with the DataStream API and operators implemented using RichedFunctions. I know that Flink provides a set of window-based operators with time-based semantics and tumbling/sliding windows. By reading the Flink documentation, I understand that there is

Re: Zookeeper HA with Kubernetes: Possible to use the same Zookeeper cluster w/multiple Flink Operators?

2023-09-21 Thread Brian King
Gyula, Thanks, you've helped us move closer to migrating our application to the Flink Operator![0] Best, Brian King SRE, Data Platform/Search Platform Wikimedia Foundation IRC: inflatador [0] https://phabricator.wikimedia.org/T326409 > On Sep 21, 2023, at 12:16 AM, Gyula Fóra wrote: > >

Re: Zookeeper HA with Kubernetes: Possible to use the same Zookeeper cluster w/multiple Flink Operators?

2023-09-20 Thread Gyula Fóra
Hi! The cluster-id for each FlinkDeployment is simply the name of the deployment. So they are all different in a given namespace. (In other words they are not fixed as your question suggests but set automatically) So there should be no problem sharing the ZK cluster . Cheers Gyula On Thu, 21 Se

Zookeeper HA with Kubernetes: Possible to use the same Zookeeper cluster w/multiple Flink Operators?

2023-09-20 Thread Brian King
Hello Flink Users! We're attempting to deploy a Flink application cluster on Kubernetes, using the Flink Operator and Zookeeper for HA. We're using Flink 1.16 and I have a question about some of the Zookeeper configuration[0]: "high-availability.zookeeper.path.root" is described as "The root Z

RE: Is it possible to preserve chaining for multi-input operators?

2023-03-24 Thread Schwalbe Matthias
Hi Viacheslav, … back from vacation … you are welcome, glad to hear it worked out 😊 Thias From: Viacheslav Chernyshev Sent: Thursday, March 16, 2023 5:34 PM To: user@flink.apache.org Subject: Re: Is it possible to preserve chaining for multi-input operators? Hi Matthias, Just wanted to

Re: Is there a way to control the parallelism of auto-generated Flink operators of the FlinkSQL job graph?

2023-03-24 Thread Hang Ruan
operator. After all we don't know what operators will be contained when we write the sql. Best, Hang Elkhan Dadashov 于2023年3月24日周五 14:14写道: > Checking with the community again, if anyone explored this before. > > Thanks. > > > On Fri, Mar 17, 2023 at 1:56 PM Elk

Re: Is it possible to preserve chaining for multi-input operators?

2023-03-16 Thread Viacheslav Chernyshev
been facing before. Kind regards, Viacheslav From: Schwalbe Matthias Sent: 28 February 2023 15:50 To: Viacheslav Chernyshev ; user@flink.apache.org Subject: RE: Is it possible to preserve chaining for multi-input operators? Hi Viacheslav, Certainly

RE: Is it possible to preserve chaining for multi-input operators?

2023-02-28 Thread Schwalbe Matthias
Flink-speak) for your operator * Windowing can be implemented manually and modelled after the official Flink windowing operators * Should you absolutely need more than one windowing namespace, then you need to become creative with state primitives * You mentioned also broadcast

Re: Is it possible to preserve chaining for multi-input operators?

2023-02-28 Thread Viacheslav Chernyshev
t: RE: Is it possible to preserve chaining for multi-input operators? Hi Viacheslav, These are two very interesting questions… You have found out about the chaining restriction to single input operators to be chained, it does also not help to union() multiple streams into a single input,

RE: Is it possible to preserve chaining for multi-input operators?

2023-02-28 Thread Schwalbe Matthias
Hi Viacheslav, These are two very interesting questions... You have found out about the chaining restriction to single input operators to be chained, it does also not help to union() multiple streams into a single input, they still count as multiple inputs. * The harder way to go would

Is it possible to preserve chaining for multi-input operators?

2023-02-28 Thread Viacheslav Chernyshev
Hi everyone, My team is developing a streaming pipeline for analytics on top of market data. The ultimate goal is to be able to handle tens of millions of events per second distributed across the cluster according to the unique ID of a particular financial instrument. Unfortunately, we struggle

Re: Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-07 Thread Isidoros Ioannou
operators. The slot assignment of the KeyGroup is done by flink. You mean key by again by a different property so that the previous aggregate events get reassigned again to operators. I apologize if my question is naive but I got a little confused. Στις Δευ 4 Απρ 2022 στις 10:38 π.μ., ο/η Arvid

Re: The flink-kubernetes-operator vs third party flink operators

2022-04-05 Thread Yang Wang
nee, then it’s probably a good candidate to work on? > > > > *From: *Gyula Fóra > *Date: *Saturday, April 2, 2022 at 2:19 AM > *To: *Hao t Chang > *Cc: *"user@flink.apache.org" > *Subject: *[EXTERNAL] Re: The flink-kubernetes-operator vs third party > f

RE: The flink-kubernetes-operator vs third party flink operators

2022-04-05 Thread Hao t Chang
signee, then it’s probably a good candidate to work on? From: Gyula Fóra Date: Saturday, April 2, 2022 at 2:19 AM To: Hao t Chang Cc: "user@flink.apache.org" Subject: [EXTERNAL] Re: The flink-kubernetes-operator vs third party flink operators Hi! The main difference at the moment is

Re: Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-04 Thread Arvid Heise
3.2 that consists of a kafka >> source with one partition so far >> > then we filter the data based on some conditions, mapped to POJOS and >> we transform to a KeyedStream based on an accountId long property from the >> POJO. The downstream operators are 10 CEP operators that ru

Re: Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-04 Thread Isidoros Ioannou
ta based on some conditions, mapped to POJOS and we > transform to a KeyedStream based on an accountId long property from the > POJO. The downstream operators are 10 CEP operators that run with > parallelism of 14 and the maxParallelism is set to the (operatorParallelism > * operatorParall

Re: The flink-kubernetes-operator vs third party flink operators

2022-04-02 Thread Gyula Fóra
Hi! The main difference at the moment is the programming language and the APIs used to interact with Flink. The flink-kubernetes-operator, uses Java and interacts with Flink using the built in (native) clients. The other operators have been around since earlier Flink versions. They all use

The flink-kubernetes-operator vs third party flink operators

2022-04-02 Thread Hao t Chang
Hi I started looking into Flink recently more specifically the flink-kubernetes-operator so I only know little about it. I found at least 3 other Flink K8s operators that Lyft, Google, and Spotify developed. Could someone please enlighten me what is the difference of these third party Flink

Re: Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-01 Thread Qingsheng Ren
> > we ran a flink application version 1.13.2 that consists of a kafka source > with one partition so far > then we filter the data based on some conditions, mapped to POJOS and we > transform to a KeyedStream based on an accountId long property from the POJO. > The downstream ope

Unbalanced distribution of keyed stream to downstream parallel operators

2022-04-01 Thread Isidoros Ioannou
Hello, we ran a flink application version 1.13.2 that consists of a kafka source with one partition so far then we filter the data based on some conditions, mapped to POJOS and we transform to a KeyedStream based on an accountId long property from the POJO. The downstream operators are 10 CEP

Re: "No operators defined in streaming topology" error when Flink app still starts successfully

2022-02-15 Thread Chesnay Schepler
). String executionPlan = env.getExecutionPlan(); env.execute(); logger.info("Started job; executionPlan={}", getExecutionPlan); On 14/02/2022 17:40, Shane Bishop wrote: Hi all, My team has started seeing the error "java.lang.IllegalStateException: No operators defined in s

"No operators defined in streaming topology" error when Flink app still starts successfully

2022-02-14 Thread Shane Bishop
Hi all, My team has started seeing the error "java.lang.IllegalStateException: No operators defined in streaming topology. Cannot execute." However, even with this error, the Flink application starts and runs fine, and the Flink job renders fine in the Flink Dashboard. Attached i

Re: pyflink mixed with Java operators

2022-01-10 Thread Francis Conroy
Thanks for this Dian. I'll give that a try. On Mon, 10 Jan 2022 at 22:51, Dian Fu wrote: > Hi, > > You could try the following method: > > ``` > from pyflink.java_gateway import get_gateway > > jvm = get_gateway().jvm > ds = ( > DataStream(ds._j_data_stream.map(jvm.com.example.MyJavaMapFunct

Re: pyflink mixed with Java operators

2022-01-10 Thread Dian Fu
Hi, You could try the following method: ``` from pyflink.java_gateway import get_gateway jvm = get_gateway().jvm ds = ( DataStream(ds._j_data_stream.map(jvm.com.example.MyJavaMapFunction())) ) ``` Regards, Dian On Fri, Jan 7, 2022 at 1:00 PM Francis Conroy wrote: > Hi all, > > Does anyon

Creating DS in low level conversion operators i.e ProcessFunctions

2022-01-08 Thread Siddhesh Kalgaonkar
Hey Team, I have a flow like Kafka Sink Datastream -> Process Function (Separate Class) -> DBSink(Separate Class). Process Function returns me the output as a string and now I want to create a DataStream out of the string variable so that I can call something like ds.addSink(new DBSink()). For th

pyflink mixed with Java operators

2022-01-06 Thread Francis Conroy
Hi all, Does anyone know if it's possible to specify a java map function at some intermediate point in a pyflink job? In this case SimpleCountMeasurementsPerUUID is a flink java MapFunction. The reason we want to do this is that performance in pyflink seems quite poor. e.g. import logging impor

Re: How to write unit test for stateful operators in Pyflink apps

2021-11-07 Thread Long Nguyễn
Hi Dian, I got it. A few days ago, I also found some test cases implemented with Python here in Flink's official repository. I took a look at them and it seems like many internal functions are used and since those are

Re: How to write unit test for stateful operators in Pyflink apps

2021-11-07 Thread Dian Fu
Hi Long, I agree with Fabian that currently you have to test it with a e2e job. There are still no such test harnesses for PyFlink jobs. Regards, Dian On Fri, Nov 5, 2021 at 5:22 PM Long Nguyễn wrote: > Thanks, Fabian. I'll check it out. > > Hope that Dian can also give me some advice. > > Bes

Re: How to write unit test for stateful operators in Pyflink apps

2021-11-05 Thread Long Nguyễn
Thanks, Fabian. I'll check it out. Hope that Dian can also give me some advice. Best, Long On Fri, Nov 5, 2021 at 3:48 PM Fabian Paul wrote: > Hi, > > Since you want to use Table API you probably can write a more high-level > test around executing the complete program. A good examples are the

Re: How to write unit test for stateful operators in Pyflink apps

2021-11-05 Thread Fabian Paul
Hi, Since you want to use Table API you probably can write a more high-level test around executing the complete program. A good examples are the pyflink example programs [1]. I also could not find something similar to the testing harness from Java. I cced Dian maybe he knows more about it. Be

How to write unit test for stateful operators in Pyflink apps

2021-11-04 Thread Long Nguyễn
Hi. I'm using Pyflink and Table APIs to create a window with Python. I have just read the Unit Testing Stateful or Timely UDFs & Custom Operators <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/testing/#unit-testing-stateful-or-timely-udfs--cust

RE: Event is taking a lot of time between the operators

2021-09-29 Thread Sanket Agrawal
Thank you @Piotr Nowojski<mailto:pnowoj...@apache.org> for helping me. From: Piotr Nowojski Sent: Wednesday, September 29, 2021 12:53 PM To: Sanket Agrawal Cc: Ragini Manjaiah ; user@flink.apache.org Subject: Re: Event is taking a lot of time between the operators [**EXTERNAL EMAIL

Re: Event is taking a lot of time between the operators

2021-09-29 Thread Piotr Nowojski
ternal queue by adjusting the `capacity` parameter [2] The more buffered in-flight data you have between operators, the longer the delay between processing the same record by two different operators. Best, Piotrek [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/net

RE: Event is taking a lot of time between the operators

2021-09-28 Thread Sanket Agrawal
ot of time between the operators [**EXTERNAL EMAIL**] Hi Sanket, I have a similar use case. how are you measuring the time for Async1` function to return the result and external api call On Wed, Sep 29, 2021 at 10:47 AM Sanket Agrawal mailto:sanket.agra...@infosys.com>> wrote: Hi

Re: Event is taking a lot of time between the operators

2021-09-28 Thread Ragini Manjaiah
tly) is available for the next message as soon as it > calls CompletableFuture.supplyAsync on the current message. > > > > Thanks, > > Sanket Agrawal > > > > *From:* Piotr Nowojski > *Sent:* Tuesday, September 28, 2021 8:02 PM > *To:* Sanket Agrawal

RE: Event is taking a lot of time between the operators

2021-09-28 Thread Sanket Agrawal
Event is taking a lot of time between the operators [**EXTERNAL EMAIL**] Hi, With Flink 1.8.0 I'm not sure how reliable the backpressure status is in the WebUI when it comes to the Async operators. If I remember correctly until around Flink 1.10 (+/- 2 version) backpressure monitoring w

Re: Event is taking a lot of time between the operators

2021-09-28 Thread Piotr Nowojski
Hi, With Flink 1.8.0 I'm not sure how reliable the backpressure status is in the WebUI when it comes to the Async operators. If I remember correctly until around Flink 1.10 (+/- 2 version) backpressure monitoring was checking for thread dumps stuck in requesting Flink's network memory b

Event is taking a lot of time between the operators

2021-09-28 Thread Sanket Agrawal
Hi All, I am new to Flink. While developing a Flink application We observed that our message is taking around 10 seconds between the two Async operators. Below are the details. * Flink Flow: Kinesis Source -> Process -> Async1 -> Async2 -> Process -> Kinesis Sink

Re: Flink performance with multiple operators reshuffling data

2021-08-31 Thread JING ZHANG
Hi Jason, > In our case, our input/output ratio of these Flin operators are all 1 to 1, so I guess it doesn't matter that much.. Yes > But I think the keys we are using in general are pretty uniform. Cool. You could run for a period of time to see if there is data skew. If there is in

Re: Flink performance with multiple operators reshuffling data

2021-08-31 Thread Jason Liu
case, our input/output ratio of these Flin operators are all 1 to 1, so I guess it doesn't matter that much..? It's good to know Flink would be scalable in this situation. -Jason

Re: Flink performance with multiple operators reshuffling data

2021-08-30 Thread JING ZHANG
function with the lowest selectivity at the top. The lower the ratio of output records number to input records number, the lower the selectivity is. Best, JING ZHANG Jason Liu 于2021年8月31日周二 上午8:12写道: > Hi there, > > We have this use case where we need to have multiple keybys

Re: Flink performance with multiple operators reshuffling data

2021-08-30 Thread Caizhi Weng
午8:12写道: > Hi there, > > We have this use case where we need to have multiple keybys operators > with its own MapState, all with different keys, in a single Flink app. This > obviously means we'll be reshuffling our data a lot. > Our TPS is around 1-2k, with ~2

Flink performance with multiple operators reshuffling data

2021-08-30 Thread Jason Liu
Hi there, We have this use case where we need to have multiple keybys operators with its own MapState, all with different keys, in a single Flink app. This obviously means we'll be reshuffling our data a lot. Our TPS is around 1-2k, with ~2kb per event and we use Kinesis Data Analyti

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-07 Thread Arvid Heise
Hi Jiahui, changing the job graph is what I meant with application upgrade. There is no difference between checkpoint and savepoint afaik. New operators would be initialized with empty state - so correct for stateless operators. So it should work for all sketched cases with both savepoints and

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-07 Thread Jiahui Jiang
Hello Arvid, how about no upgrade, just changing the job graph by having different stateless operators? Will checkpoint be sufficient? The specific example use case is - we have some infrastructure that orchestrates and runs user SQL queries. Sometimes in between runs users might have changed

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-07 Thread Arvid Heise
gt; What are the behavior differences when recovering from a checkpoint vs. a > savepoint? If the job graph changes between runs, but all the stateful > operators are guaranteed to have their UID fixed. Will a pipeline be able > to restore from the retained checkpoint if incre

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-06 Thread Jiahui Jiang
7;s a savepoint or checkpoint, What are the behavior differences when recovering from a checkpoint vs. a savepoint? If the job graph changes between runs, but all the stateful operators are guaranteed to have their UID fixed. Will a pipeline be able to restore from the retained checkpoint if increm

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-02 Thread Roman Khachatryan
state has the corresponding ID in the snapshot (though technically the snapshot for the chain is sent as a single object to the JM). Probably some intermediate operators have state. How do you verify that they don't? Exception message could probably help to identify the problematic operators. Re

Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-02 Thread Jiahui Jiang
Hello Flink, I'm trying to understand the state recovery mechanism when there are extra stateless operators. I'm using flink-sql, and I tested a 'select `number_col` from source' query, where the stream graph looks like: `source (stateful with fixed uid) -> [seve

Re: Flink in k8s operators list

2021-05-31 Thread Svend
Hi Ilya, Thanks for the kind feed-back. We hit the first issue you mention related to K8s 1.18+, we then updated the controller-gen version to 0.2.4 in the makefile as described in the ticket you linked, and then ran "make deploy", which worked around the issue for us. I'm not aware of the 2nd

Re: Flink in k8s operators list

2021-05-31 Thread Ilya Karpov
Hi Svend, thank you so much to sharing your experience! GCP k8s operator looks promising (currently i’m trying to build it and run helm chart. An issue with k8s version 1.18+ is road block right now, but I see that there

Re: Flink in k8s operators list

2021-05-29 Thread Svend
Hi Ilya, At my company we're currently using the GCP k8s operator (2nd on your list). Our usage is very moderate, but so far it works great for us. We appreciate that when upgrading the application, it triggers automatically a savepoint during shutdown and resumes from it when restarting. It al

Re: Flink in k8s operators list

2021-05-28 Thread Yuval Itzchakov
https://github.com/lyft/flinkk8soperator On Fri, May 28, 2021, 10:09 Ilya Karpov wrote: > Hi there, > > I’m making a little research about the easiest way to deploy link job to > k8s cluster and manage its lifecycle by *k8s operator*. The list of > solutions is below: > - https://github.com/fint

Flink in k8s operators list

2021-05-28 Thread Ilya Karpov
Hi there, I’m making a little research about the easiest way to deploy link job to k8s cluster and manage its lifecycle by k8s operator. The list of solutions is below: - https://github.com/fintechstudios/ververica-platform-k8s-operator - https://github.com/GoogleCloudPlatform/flink-on-k8s-opera

Re: Long checkpoint duration for Kafka source operators

2021-05-20 Thread Hubert Chen
experiencing long end to end > checkpoint durations for the Kafka source operators. I'm hoping I could get > some direction in how to debug this further. > > Here is a UI screenshot of a checkpoint instance: > > [image: checkpoint.png] > > My goal is to bring the total c

Long checkpoint duration for Kafka source operators

2021-05-13 Thread Hubert Chen
Hello, I have an application that reads from two Kafka sources, joins them, and produces to a Kafka sink. The application is experiencing long end to end checkpoint durations for the Kafka source operators. I'm hoping I could get some direction in how to debug this further. Here is

Re: Flink Checkpoint for Stateless Operators

2021-04-29 Thread Roman Khachatryan
I'm not sure I understand the use case. What do you do with the results >> of Flat Map? >> >> [1 https://arxiv.org/pdf/1506.08603.pdf >> [2] >> https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html >> >> Regards, >> R

Re: Flink Checkpoint for Stateless Operators

2021-04-29 Thread Raghavendar T S
che-flink.html > > Regards, > Roman > > > On Thu, Apr 29, 2021 at 3:16 PM Raghavendar T S > wrote: > >> Hi Roman >> >> In general, how Flink tracks the events from source to downstream >> operators? We usually emit existing events from an operator or crea

Re: Flink Checkpoint for Stateless Operators

2021-04-29 Thread Roman Khachatryan
tly-once-apache-flink.html Regards, Roman On Thu, Apr 29, 2021 at 3:16 PM Raghavendar T S wrote: > Hi Roman > > In general, how Flink tracks the events from source to downstream > operators? We usually emit existing events from an operator or create a new > instance of a class and

Re: Flink Checkpoint for Stateless Operators

2021-04-29 Thread Raghavendar T S
Hi Roman In general, how Flink tracks the events from source to downstream operators? We usually emit existing events from an operator or create a new instance of a class and emit it. How does Flink or Flink source know whether the events are which snapshot? > So you don't need to re-pr

Re: Flink Checkpoint for Stateless Operators

2021-04-29 Thread Roman Khachatryan
Hi Raghavendar, In Flink, checkpoints are global, meaning that a checkpoint is successful only if all operators acknowledge it. So the offset will be stored in state and then committed to Kafka [1] only after all the tasks acknowledge that checkpoint. At that moment, the element must be either

Flink Checkpoint for Stateless Operators

2021-04-29 Thread Raghavendar T S
ade a successful checkpoint. In this case, does the offset of event 1 will be part of the checkpoint? Will Flink track the event from source to all downstream operators? If this is a true case and If the processing of the event is failed (any third party API/DB failure) in the Flat Map after a succe

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-04-13 Thread Arvid Heise
issue so far. Could you > please describe in more detail the job graph, in particular what are > the downstream operators and whether there is any chaining? > > Do I understand correctly, that Flink returned back to normal at > around 8:00; worked fine for ~3 hours; got stuck

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-04-06 Thread Roman Khachatryan
Hi Sihan, Unfortunately, we are unable to reproduce the issue so far. Could you please describe in more detail the job graph, in particular what are the downstream operators and whether there is any chaining? Do I understand correctly, that Flink returned back to normal at around 8:00; worked

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-03-30 Thread Sihan You
whole job 5? How is the > > > > > > > > > topology roughly looking? (e.g., Source -> Map -> Sink?) > > > > > > > > > 5. Did you see any warns/errors in the logs related to > > > > > > > > > checkpointing and I/O? > >

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-29 Thread David Anderson
Yes, since the two streams have the same type, you can union the two streams, key the resulting stream, and then apply something like a RichFlatMapFunction. Or you can connect the two streams (again, they'll need to be keyed so you can use state), and apply a RichCoFlatMapFunction. You can use whic

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-29 Thread vishalovercome
I've gone through the example as well as the documentation and I still couldn't understand whether my use case requires joining. 1. What would happen if I didn't join?2. As the 2 incoming data streams have the same type, if joining is absolutely necessary then just a union (oneStream.union(anotherS

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-24 Thread vishalovercome
Let me make the example more concrete. Say O1 gets as input a data stream T1 which it splits into two using some function and produces DataStreams of type T2 and T3, each of which are partitioned by the same key function TK. Now after O2 processes a stream, it could sometimes send the stream to O3

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-24 Thread David Anderson
For an example of a similar join implemented as a RichCoFlatMap, see [1]. For more background, the Flink docs have a tutorial [2] on how to work with connected streams. [1] https://github.com/apache/flink-training/tree/master/rides-and-fares [2] https://ci.apache.org/projects/flink/flink-docs-stab

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-24 Thread Matthias Pohl
1. yes - the same key would affect the same state variable 2. you need a join to have the same operator process both streams Matthias On Wed, Mar 24, 2021 at 7:29 AM vishalovercome wrote: > Let me make the example more concrete. Say O1 gets as input a data stream > T1 > which it splits into two

Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-23 Thread vishalovercome
Suppose i have a job with 3 operators with the following job graph: O1 => O2 // data stream partitioned by keyBy O1 => O3 // data stream partitioned by keyBy O2 => O3 // data stream partitioned by keyBy If operator O3 receives inputs from two operators and both inputs have the same

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-23 Thread Matthias Pohl
Hi Vishal, I'm not 100% sure what you're trying to do. But the partitioning by a key just relies on the key on the used parallelism. So, I guess, what you propose should work. You would have to rely on some join function, though, when merging two input operators into one again. I hop

Re: Approaches to customize the parallelism in SQL generated operators

2021-03-22 Thread eef hhj
tream operators can well handle the traffic during cold start. Per our observation, it works, not sure if this is the suggested way for that. Any other suggestion is appreciated. Another direclty we want to explore is only to change parallelism of the source consumer, but not the subsequent ones, any fu

Re: Approaches to customize the parallelism in SQL generated operators

2021-03-20 Thread David Anderson
No, there is no mechanism available for individually tuning the parallelism of the generated operators in a SQL job. Moreover, such fine-tuning is often counter-productive. In most cases you are better off simply setting the overall parallelism to whatever is needed by the busiest operator(s

Approaches to customize the parallelism in SQL generated operators

2021-03-18 Thread eef hhj
. We want to know that if there is any way to customize the parallelism of the SQL generated operators individually so that we can make their powers match with their actual load to make operators' load evenly distributed. Except to customize the parallelism of the operators, is there any

Re: Broadcasting to multiple operators

2021-03-05 Thread David Anderson
; } >>> } >>> >>> >>> By setting the watermark for this stream to MAX_WATERMARK, you are >>> effectively removing this stream's watermarks from consideration. >>> >>> Regards, >>> David >>

Re: Broadcasting to multiple operators

2021-03-05 Thread Roger
t;> { >> return 0; >> } >> } >> >> >> By setting the watermark for this stream to MAX_WATERMARK, you are >> effectively removing this stream's watermarks from consideration. >> >> Regards, >> David >> >>

Re: Broadcasting to multiple operators

2021-03-05 Thread Roger
gt; Regards, > David > > On Fri, Mar 5, 2021 at 5:48 PM Roger wrote: > >> Hello. >> I am having an issue with a Flink 1.8 pipeline when trying to consume >> broadcast state across multiple operators. I currently >> have a working pipeline that looks like

Re: Broadcasting to multiple operators

2021-03-05 Thread David Anderson
ively removing this stream's watermarks from consideration. Regards, David On Fri, Mar 5, 2021 at 5:48 PM Roger wrote: > Hello. > I am having an issue with a Flink 1.8 pipeline when trying to consume > broadcast state across multiple operators. I currently > have a working pi

Broadcasting to multiple operators

2021-03-05 Thread Roger
Hello. I am having an issue with a Flink 1.8 pipeline when trying to consume broadcast state across multiple operators. I currently have a working pipeline that looks like the following: records .assignTimestampsAndWatermarks( new BoundedOutOfOrdernessGenerator( Long.parseLong

  1   2   3   4   >