Re: [DISCUSS] Watermark propagation with Sink API

2021-07-18 Thread Arvid Heise
Hi everyone, I created a FLIP and started a discussion around that topic [1]. Best, Arvid [1] https://lists.apache.org/thread.html/r8357d64b9cfdf5a233c53a20d9ac62b75c07c925ce2c43e162f1e39c%40%3Cdev.flink.apache.org%3E On Tue, Jun 8, 2021 at 11:46 PM Eron Wright wrote: > Thanks, the narrowed

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Eron Wright
Thanks, the narrowed FLIP-167 is fine for now. I'll re-activate the vote process. Thanks! On Tue, Jun 8, 2021 at 3:01 AM Till Rohrmann wrote: > Hi everyone, > > I do agree that Flink's definition of idleness is not fully thought through > yet. Consequently, I would feel a bit uneasy to make it

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Till Rohrmann
Hi everyone, I do agree that Flink's definition of idleness is not fully thought through yet. Consequently, I would feel a bit uneasy to make it part of Flink's API right now. Instead, defining the proper semantics first and then exposing it sounds like a good approach forward. Hence, +1 for optio

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-08 Thread Piotr Nowojski
Hi Eron, The FLIP-167 is narrow, but we recently discovered some problems with current idleness semantics as Arvid explained. We are planning to present a new proposal to redefine them. Probably as a part of it, we would need to rename them. Given that, I think it doesn't make sense to expose idle

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-07 Thread Eron Wright
> > >> > > > > > > > > >> > gist is a new `markIdle` method on the `StreamOperator` > > > > interface, > > > > > > > >> > > > > > > > > >> > that > > > > > > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-07 Thread Arvid Heise
> > > >> > Please let me know if you'd like me to proceed to update the > > > FLIP > > > > > > >> > > > > > > > >> > with > > > > > > >> > > > > > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
o > > > > > >> > > > > > > >> > be > > > > > >> > > > > > > >> > encoded. > > > > > >> > > > > > > >> > It seems like this point was asked, but not followe

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
]. > > > > >> > If you start it, I can already cast a binding vote and we just > > > > >> > > > > > >> > need 2 > > > > >> > > > > > >> > more > > > > >> > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
>> > wrote: > > > >> > > > > >> > > > > >> > Arvid, > > > >> > Thanks for the feedback. I investigated the japicmp > > > >> > > > > >> > configuration, > > > >> >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
> >> > > > >> > 1. fair point. It still feels odd to have writeWatermark in the > > >> > SinkFunction (it's supposed to be functional as you mentioned), > > >> > > > >> > but I > > >> > > > &

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
gt;> > > >> > is really just context information of that particular record and > >> > > >> > I > >> > > >> > don't > >> > > >> > see any use beyond timestamp and watermark. For SinkFunction, I'd > >> >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
io >> > >> > .invalid> >> > wrote: >> > >> > >> > Arvid, >> > >> > 1. I assume that the method name `invoke` stems from >> > >> > considering >> > >> > the >> > >> > Sink

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Arvid Heise
he same as for elements. > > > > Thanks, > > Eron > > > > On Tue, May 25, 2021 at 12:42 PM Arvid Heise > > > wrote: > > > > Hi Eron, > > > > I think the FLIP is crisp and mostly good to go. Some smaller > > things/qu

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Piotr Nowojski
14 (with this FLIP) or vice versa? >5. What happens with sinks that use the new methods in > > applications > > that > >do not have watermarks (batch mode, processing time)? Does > > this > > also > > work >with ingestion time sufficiently? >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Piotr Nowojski
>>> programming model is more unified in that respect since 1.12 > > >>>>>> (FLIP-134). > > >>>>>>>> 6. The failure behavior is the same as for elements. > > >>>>>>>> > > >>>>>&

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Eron Wright
go. Some smaller > >>>>>>>>> things/questions: > >>>>>>>>> > >>>>>>>>>1. SinkFunction#writeWatermark could be named > >>>>>>>>>SinkFunction#invokeWatermark or invokeOnWa

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-04 Thread Dawid Wysakowicz
;>>>>>>>StreamingFileSink bucket policies. We may add that >>> processing >>>>> time >>>>>>>> flag >>>>>>>>>also to SinkWriter#Context in the future. >>>>>>>>>3. Alter

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Eron Wright
tly once sinks deal with written watermarks > in > > > > case > > > > > of > > > > > > > >failure? I guess it's the same as normal records. (Either > > > > rollback > > > > > > of > > > > > &

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Arvid Heise
gt; > > > > > wrote: > > > > > > > > > > > > > > > Does anyone have further comment on FLIP-167? > > > > > > > > > > > > > > > > > > > > > > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Eron Wright
> > > > > > Eron > > > > > > > > > > > > > > > > > > > > > On Thu, May 20, 2021 at 5:02 PM Eron Wright < > > > ewri...@streamnative.io > > > > > > > > > > > > wrote: > &g

Re: [DISCUSS] Watermark propagation with Sink API

2021-06-03 Thread Piotr Nowojski
for Sink API: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-167%3A+Watermarks+for+Sink+API > > > > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-30 Thread Arvid Heise
; > > > > > > > I'd like to call a vote next week, is that reasonable? > > > > > > > > > > > > > > > > > > On Wed, May 19, 2021 at 6:28 PM Zhou, Brian > > wrote: > > > > > > > > > > > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-28 Thread Eron Wright
cussion and I read through Eron's pull request > > and I > > > > >> think this can benefit Pravega Flink connector as well. > > > > >> > > > > >> Here is some background. Pravega had the watermark concept through > > the > >

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-28 Thread Arvid Heise
nted to propagate the Flink watermark to Pravega in the > > > >> SinkFunction, and at that time we just used the existing Flink API > > that > > > we > > > >> keep the last watermark in memory and check if watermark changes for > > > each >

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-26 Thread Eron Wright
existing Flink API > that > > we > > >> keep the last watermark in memory and check if watermark changes for > > each > > >> event[2] which is not efficient. With such new interface, we can also > > >> manage the watermark propagation much more

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-25 Thread Arvid Heise
ich is not efficient. With such new interface, we can also > >> manage the watermark propagation much more easily. > >> > >> [1] https://pravega.io/blog/2019/11/08/pravega-watermarking-support/ > >> [2] > >> > https://github.com/pravega/fl

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-25 Thread Eron Wright
/src/main/java/io/pravega/connectors/flink/FlinkPravegaWriter.java#L465 >> >> -----Original Message- >> From: Arvid Heise >> Sent: Wednesday, May 19, 2021 16:06 >> To: dev >> Subject: Re: [DISCUSS] Watermark propagation with Sink API >> >> >> [

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-20 Thread Eron Wright
in/java/io/pravega/connectors/flink/FlinkPravegaWriter.java#L465 > > -Original Message- > From: Arvid Heise > Sent: Wednesday, May 19, 2021 16:06 > To: dev > Subject: Re: [DISCUSS] Watermark propagation with Sink API > > > [EXTERNAL EMAIL] > > Hi Eron, > > Than

RE: [DISCUSS] Watermark propagation with Sink API

2021-05-19 Thread Zhou, Brian
flink/FlinkPravegaWriter.java#L465 -Original Message- From: Arvid Heise Sent: Wednesday, May 19, 2021 16:06 To: dev Subject: Re: [DISCUSS] Watermark propagation with Sink API [EXTERNAL EMAIL] Hi Eron, Thanks for pushing that topic. I can now see that the benefit is even bigger t

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-19 Thread Arvid Heise
Hi Eron, Thanks for pushing that topic. I can now see that the benefit is even bigger than I initially thought. So it's worthwhile anyways to include that. I also briefly thought about exposing watermarks to all UDFs, but here I really have an issue to see specific use cases. Could you maybe take

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-18 Thread Eron Wright
Update: opened an issue and a PR. https://issues.apache.org/jira/browse/FLINK-22700 https://github.com/apache/flink/pull/15950 On Tue, May 18, 2021 at 10:03 AM Eron Wright wrote: > Thanks Arvid and David for sharing your ideas on this subject. I'm glad > to hear that you're seeing use cases f

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-18 Thread Eron Wright
Thanks Arvid and David for sharing your ideas on this subject. I'm glad to hear that you're seeing use cases for watermark propagation via an enhanced sink interface. As you've guessed, my interest is in Pulsar and am exploring some options for brokering watermarks across stream processing pipeli

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-18 Thread David Morávek
Hi Eron, Thanks for starting this discussion. I've been thinking about this recently as we've run into "watermark related" issues, when chaining multiple pipelines together. My to cents to the discussion: How I like to think about the problem, is that there should an invariant that holds for any

Re: [DISCUSS] Watermark propagation with Sink API

2021-05-17 Thread Arvid Heise
Hi Eron, I think this is a useful addition for storage systems that act as pass-through for Flink to reduce recovery time. It is only useful if you combine it with regional fail-over as only a small part of the pipeline is restarted. A couple of thoughts on the implications: 1. Watermarks are clo

[DISCUSS] Watermark propagation with Sink API

2021-05-17 Thread Eron Wright
I would like to propose an enhancement to the Sink API, the ability to receive upstream watermarks. I'm aware that the sink context provides the current watermark for a given record. I'd like to be able to write a sink function that is invoked whenever the watermark changes. Out of scope would