Hi Dong and Yunfeng,

Thanks for bringing up this discussion.

As described in the FLIP, the differences between `end-to-end latency` and
`table.exec.mini-batch.allow-latency` are: "It allows users to specify the
end-to-end latency, whereas table.exec.mini-batch.allow-latency applies to
each operator. If there are N operators on the path from source to sink,
the end-to-end latency could be up to table.exec.mini-batch.allow-latency *
N".
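To make the quoted difference concrete, here is a back-of-the-envelope sketch of how the per-operator latency can accumulate along the path (the numbers are hypothetical, not from the FLIP):

```java
public class LatencyBound {
    public static void main(String[] args) {
        long allowLatencyMs = 5_000; // table.exec.mini-batch.allow-latency
        int numOperators = 4;        // N operators on the source-to-sink path
        // Each operator may hold a record up to allowLatencyMs, so in the
        // worst case the delays add up across the whole path.
        long worstCaseEndToEndMs = allowLatencyMs * numOperators;
        System.out.println(worstCaseEndToEndMs); // 20000
    }
}
```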

If I understand correctly, `table.exec.mini-batch.allow-latency` also
applies to the end-to-end latency of a job, not just to each operator;
maybe @Jack Wu can give more information.

So, from my perspective, and please correct me if I have misunderstood,
the goals of this FLIP may include the following:

1. Support a mechanism like SQL's `mini-batch` for `DataStream`, where an
operator buffers incoming data and flushes it when it receives a `flush`
event (the `FlushEvent` in the FLIP).

2. Support dynamically adjusting the `latency` according to the progress
of the job, for example during the snapshot stage and after it.
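For goal 1, my mental model of the buffer-and-flush behaviour is something like the sketch below. `FlushEvent` and the operator methods here are hypothetical placeholders of my own, not the FLIP's actual interfaces:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the FLIP's FlushEvent.
class FlushEvent {
    final long flushId;
    FlushEvent(long flushId) { this.flushId = flushId; }
}

// An operator that buffers records and only emits them downstream when a
// FlushEvent arrives, similar to SQL's mini-batch behaviour.
class BufferingOperator<T> {
    private final List<T> buffer = new ArrayList<>();
    private final Consumer<List<T>> downstream;

    BufferingOperator(Consumer<List<T>> downstream) {
        this.downstream = downstream;
    }

    void processRecord(T record) {
        buffer.add(record); // hold the record instead of emitting immediately
    }

    void processFlushEvent(FlushEvent event) {
        downstream.accept(new ArrayList<>(buffer)); // emit the whole batch
        buffer.clear();
    }
}

public class FlushSketch {
    public static void main(String[] args) {
        List<List<String>> batches = new ArrayList<>();
        BufferingOperator<String> op = new BufferingOperator<>(batches::add);
        op.processRecord("a");
        op.processRecord("b");
        op.processFlushEvent(new FlushEvent(1));
        System.out.println(batches); // [[a, b]]
    }
}
```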

To do that, I have some questions:

1. I don't understand the purpose of the public interface
`RecordAttributes`. I think the `FlushEvent` in the FLIP is enough, and
different `DynamicFlushStrategy` implementations can be added to generate
flush events for different needs, such as a static interval similar to
mini-batch in SQL, or a strategy that collects the `isProcessingBacklog`
information and metrics to generate `FlushEvent`s as described in your
FLIP. If the hudi sink needs the `isBacklog` flag, the hudi
`SplitEnumerator` can create an operator event and send it to the hudi
source reader.
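To illustrate what I mean in question 1, the different strategies could share one small interface. The names and signatures below are my own invention for discussion, not anything proposed in the FLIP:

```java
// Hypothetical interface: decides whether a FlushEvent should be emitted.
interface DynamicFlushStrategy {
    boolean shouldFlush(long nowMs, long lastFlushMs, boolean isProcessingBacklog);
}

// Static interval, similar to table.exec.mini-batch.allow-latency in SQL.
class StaticIntervalStrategy implements DynamicFlushStrategy {
    private final long intervalMs;
    StaticIntervalStrategy(long intervalMs) { this.intervalMs = intervalMs; }
    public boolean shouldFlush(long nowMs, long lastFlushMs, boolean isProcessingBacklog) {
        return nowMs - lastFlushMs >= intervalMs;
    }
}

// Backlog-aware: flush rarely while catching up, eagerly once caught up.
class BacklogAwareStrategy implements DynamicFlushStrategy {
    private final long backlogIntervalMs;
    private final long realtimeIntervalMs;
    BacklogAwareStrategy(long backlogIntervalMs, long realtimeIntervalMs) {
        this.backlogIntervalMs = backlogIntervalMs;
        this.realtimeIntervalMs = realtimeIntervalMs;
    }
    public boolean shouldFlush(long nowMs, long lastFlushMs, boolean isProcessingBacklog) {
        long interval = isProcessingBacklog ? backlogIntervalMs : realtimeIntervalMs;
        return nowMs - lastFlushMs >= interval;
    }
}

public class StrategySketch {
    public static void main(String[] args) {
        DynamicFlushStrategy s = new BacklogAwareStrategy(60_000, 1_000);
        System.out.println(s.shouldFlush(5_000, 0, true));  // false: still in backlog
        System.out.println(s.shouldFlush(5_000, 0, false)); // true: caught up
    }
}
```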

2. How is this new mechanism unified with SQL's mini-batch mechanism? As
far as I know, SQL implements its mini-batch mechanism based on
watermarks, and I think it would be very unreasonable to have two
different implementations in SQL and DataStream.

3. I notice that the `CheckpointCoordinator` will generate `FlushEvent`s.
What information about a `FlushEvent` will be stored in the `Checkpoint`?
What is the alignment strategy for `FlushEvent`s in an operator: will the
operator flush its data only after it has received the `FlushEvent` with
the same ID from all upstream channels, or flush for each `FlushEvent`?
Could you give a more detailed proposal about that? We also have a need
for this, thanks.
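For question 3, the "align on all inputs" interpretation would look roughly like the sketch below. This is again a hypothetical illustration of my own, since the FLIP does not spell the alignment out:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical alignment: an operator with several input channels flushes
// only after it has seen the FlushEvent with the same ID on every channel.
class FlushAligner {
    private final int numChannels;
    private final Map<Long, Set<Integer>> seen = new HashMap<>();

    FlushAligner(int numChannels) { this.numChannels = numChannels; }

    /** Returns true if this event completes alignment and the operator should flush. */
    boolean onFlushEvent(long flushId, int channel) {
        Set<Integer> channels = seen.computeIfAbsent(flushId, id -> new HashSet<>());
        channels.add(channel);
        if (channels.size() == numChannels) {
            seen.remove(flushId); // alignment complete, clean up
            return true;
        }
        return false;
    }
}

public class AlignSketch {
    public static void main(String[] args) {
        FlushAligner aligner = new FlushAligner(2);
        System.out.println(aligner.onFlushEvent(1, 0)); // false: waiting for channel 1
        System.out.println(aligner.onFlushEvent(1, 1)); // true: all channels aligned
    }
}
```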


Best,
Shammon FY



On Thu, Jun 29, 2023 at 4:35 PM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Dong and Yunfeng,
>
> Thanks for the FLIP. What's not clear for me is what's the expected
> behaviour when the allowed latency can't be met, for whatever reason.
> Given that we're talking about an "allowed latency", it implies that
> something has gone wrong and should fail? Isn't this more a minimum
> latency that you're proposing?
>
> There's also the part about the Hudi Sink processing records
> immediately upon arrival. Given that the SinkV2 API provides the
> ability for custom post and pre-commit topologies [1], specifically
> targeted to avoid generating multiple small files, why isn't that
> sufficient for the Hudi Sink? It would be great to see that added
> under Rejected Alternatives if this is indeed not sufficient.
>
> Best regards,
>
> Martijn
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction
>
> On Sun, Jun 25, 2023 at 4:25 AM Yunfeng Zhou
> <flink.zhouyunf...@gmail.com> wrote:
> >
> > Hi all,
> >
> > Dong(cc'ed) and I are opening this thread to discuss our proposal to
> > support configuring end-to-end allowed latency for Flink jobs, which
> > has been documented in FLIP-325
> > <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-325%3A+Support+configuring+end-to-end+allowed+latency
> >.
> >
> > By configuring the latency requirement for a Flink job, users would be
> > able to optimize the throughput and overhead of the job while still
> > acceptably increasing latency. This approach is particularly useful
> > when dealing with records that do not require immediate processing and
> > emission upon arrival.
> >
> > Please refer to the FLIP document for more details about the proposed
> > design and implementation. We welcome any feedback and opinions on
> > this proposal.
> >
> > Best regards.
> >
> > Dong and Yunfeng
>