Hi everyone,
I created a FLIP and started a discussion around that topic [1].
Best,
Arvid
[1]
https://lists.apache.org/thread.html/r8357d64b9cfdf5a233c53a20d9ac62b75c07c925ce2c43e162f1e39c%40%3Cdev.flink.apache.org%3E
On Thu, Jun 10, 2021 at 9:18 PM Eron Wright
wrote:
> I quickly updated the
I quickly updated the draft PR that would propagate idleness information to
the Sink function, based on the recent improvement provided by
FLINK-18934. For illustration purposes.
https://github.com/streamnative/flink/pull/2
On Thu, Jun 10, 2021 at 11:34 AM Eron Wright
wrote:
> Regarding records
Regarding records vs watermarks, I feel it is wrong to include records in
the considerations, because the clearest definition of idleness (IMO) is
'active participation in advancing the event-time clock', and records don't
directly affect the clock. Of course, records indirectly influence the
cloc
Thanks for providing these details Gordon. I have to admit that I do not
fully follow the reasoning why periodic watermark generators forced us to
define idleness for records. Is it because the idleness was generated based
on the non-availability of more data in the sources and not in the
watermark
Forgot to provide the link to the [1] reference:
[1] https://issues.apache.org/jira/browse/FLINK-5017
On Thu, Jun 10, 2021 at 11:43 AM Tzu-Li (Gordon) Tai
wrote:
> Hi everyone,
>
> Sorry for chiming in late here.
>
> Regarding the topic of changing the definition of StreamStatus and
> changing
Hi everyone,
Sorry for chiming in late here.
Regarding the topic of changing the definition of StreamStatus and changing
the name as well:
After digging into some of the roots of this implementation [1], initially
the StreamStatus was actually defined to mark "watermark idleness", and not
"record
Hi Eron,
again to recap from the other thread:
- You are right that idleness is correct with static assignment and fully
active partitions. In this case, the source defines idleness. (case A)
- For the more pressing use cases of idle, assigned partitions, the user
defines an idleness threshold, an
Hi Eron,
Can you elaborate a bit more what do you mean? I don’t understand what do you
mean by more general solution.
As of now, stream is marked idle by a source/watermark generator, which has an
effect of temporarily ignoring this stream/partition from calculating min
watermark in the downs
It seems to me that idleness was introduced to deal with a very specific
issue. In the pipeline, watermarks are aggregated not on a per-split basis
but on a per-subtask basis. This works well when each subtask has exactly
one split. When a sub-task has multiple splits, various complications
occu
Hi Arvid,
Thanks for writing down this summary and proposal. I think this was the
foundation of the disagreement in FLIP-167 discussion. Dawid was arguing
that idleness is intermittent, strictly a task local concept and as such
shouldn't be exposed in for example sinks. While me and Eron thought t
10 matches
Mail list logo