Sounds good. Thanks for clarifying.
-Matthias
On 2/6/25 3:19 PM, Almog Gavra wrote:
Good call on the backwards compatibility - updated the KIP.
Re: the grace period for BatchWindows, I think zero makes sense (and also
makes implementing things a lot easier). In my mental model, we still drop
late records that come in after the window closes, they just never happen
because we use
Hit "reply" too early. Just re-read the KIP.
For `Windows#windowsFor(...)`, even if not intended to be implemented by
users, it's strictly public API. Thus, we cannot just change the method,
but would need to keep the existing method and deprecate it, and add a
new overload with a default impl t
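(Editor's sketch of the deprecate-and-overload pattern described above. This is not the KIP's actual API: the class names `SketchWindows` and `LegacyTumbling`, the extra `streamTime` parameter, and the `Map<Long, String>` return type are all assumptions made for illustration. The point is only that a new overload with a default implementation lets existing subclasses keep compiling and behaving as before.)

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an abstract windows type; NOT the real Kafka Streams class.
abstract class SketchWindows {
    /** Existing public method: kept for compatibility, but marked deprecated. */
    @Deprecated
    public abstract Map<Long, String> windowsFor(long timestamp);

    /**
     * New overload. The second parameter is an assumption for this sketch.
     * The default implementation delegates to the old method, so subclasses
     * written against the old signature keep working unchanged.
     */
    public Map<Long, String> windowsFor(long timestamp, long streamTime) {
        return windowsFor(timestamp);
    }
}

// A "user" subclass that only knows about the old method.
final class LegacyTumbling extends SketchWindows {
    private final long sizeMs;

    LegacyTumbling(long sizeMs) {
        this.sizeMs = sizeMs;
    }

    @Override
    public Map<Long, String> windowsFor(long timestamp) {
        long start = timestamp - (timestamp % sizeMs);
        Map<Long, String> windows = new TreeMap<>();
        // Map window start to a readable [start,end) description.
        windows.put(start, "[" + start + "," + (start + sizeMs) + ")");
        return windows;
    }
}
```

Calling `new LegacyTumbling(10L).windowsFor(25L, 100L)` goes through the default overload and returns the same single window starting at 20 that the old one-argument method would return.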
BatchWindows works for me.
On 2/6/25 7:34 AM, Almog Gavra wrote:
Happy to name it BatchWindows. Will give some people time to chime in and
then change the name.
- Almog
On Tue, Feb 4, 2025 at 11:10 PM Sophie Blee-Goldman
wrote:
> One minor suggestion: use BatchWindows instead of BatchedWindows. The
> version without the "ed" matches up with the established n
One minor suggestion: use BatchWindows instead of BatchedWindows. The
version without the "ed" matches up with the established naming pattern and
grammar used by other Windows classes: eg TimeWindows, SessionWindows,
SlidingWindows
Not a big deal though, won't redact my +1 on the voting thread if
Thanks for the discussion everyone! I've updated the Wiki with the
following changes:
- Renamed to BatchedWindows
- Add a note in rejected alternatives about more general purpose
(micro-)batching functionality since the scope of that is much wider.
Since it looks like we've stabilized the discuss
batch window with max N records,
and then also specifying a BufferConfig.maxRecords()
That's actually two different and independent dimensions. "N records"
would be the number of records in the window, but `maxRecords` is the
number of unique keys/rows in the buffer before it's flushed.
t
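(Editor's sketch of the two dimensions distinguished above, using plain Java collections rather than the Kafka Streams API. The class `BatchVsBufferSketch` and both method names are hypothetical. It assumes a buffer indexed by key, where a later update for a key overwrites the earlier one, so the row count can be far below the record count.)

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Kafka Streams.
final class BatchVsBufferSketch {
    /** Dimension 1: how many records landed in the window. Every record counts. */
    static int recordsInWindow(List<String[]> records) {
        return records.size();
    }

    /**
     * Dimension 2: how many rows a key-indexed buffer holds.
     * Each element is a {key, value} pair; a later value for the same key
     * replaces the earlier one, so only unique keys are counted.
     */
    static int bufferedRows(List<String[]> records) {
        Map<String, String> buffer = new LinkedHashMap<>();
        for (String[] kv : records) {
            buffer.put(kv[0], kv[1]);
        }
        return buffer.size();
    }
}
```

With five records spread over the keys "a" and "b", the first method reports 5 while the second reports 2, which is why a limit on one does not imply a limit on the other.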
I'm not opposed to "BatchedWindows" - I think I like that the most so far.
I'll let that sit on the discussion thread for a while, and change the KIP
to match if no concerns.
> What I don't understand is, why the relationship to
suppress()/emitStrategy() is relevant? Can you elaborate a little bit
Interesting thoughts. So maybe we could go with `BatchWindows` as a
name? Again, only spit-balling...
If we really put "(micro-)batching" in the center of this idea, I think
both count-based and time-based (and time could actually be either
stream-time or wall-clock-time), or any combination o
Thanks for the feedback Lucas and Bruno!
L0. "Given the motivation section, it sounds like we actually want something
that I'd call "batching" rather than "windowing"."
You are right here, and I think ultimately introducing more flexible and
controlled micro-batching will be useful for Kafka Streams,
Hi Almog,
I had similar thoughts as Lucas. When I read the KIP, I asked myself why
are the windows not specified on number of records instead of time if we
do not care about whether the event time of the records is in the time
range of the window?
In your motivation, you write that users mig
Hi Almog,
this seems useful to me. I don't see anything wrong with the details
of the proposal.
More generally, I'd like to hear your thoughts on this vs. batching.
Given the motivation section, it sounds like we actually want something
that I'd call "batching" rather than "windowing". If you do not r
Thanks, Almog.
Good call out about `TimeWindows` vs `TimeWindow` (yes, I am aware and
was actually re-reading my previous email before sending it a few times
to make sure I use the right one; it's very subtle.)
For `TimeWindows` semantics are certainly well defined, and there is
nothing to b
Thanks Matthias for the quick and detailed feedback!
> Nit: it seems you are mixing the terms "out-of-order" and "late" and
using them as synonyms, which we usually do not do.
M1. Ah, in my mind "late arriving" was after the window closed but
potentially before grace (and "out of order" was just an
Interesting KIP. It's a known problem, and the proposed solution makes
sense to me.
Nit: it seems you are mixing the terms "out-of-order" and "late" and
using them as synonyms, which we usually do not do.
"Out-of-order" is the more generic term, while "late" means after the
grace period (hence,
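(Editor's sketch of the terminology above. The class `LatenessSketch`, the method `classify`, and the exact boundary condition are assumptions for illustration: a record is treated as late once stream time has reached the window end plus the grace period, and as merely out-of-order when its timestamp is behind stream time but its window is still open. Late is modeled as the special case of out-of-order, matching the "more generic term" wording.)

```java
// Hypothetical illustration only; not the Kafka Streams implementation.
final class LatenessSketch {
    enum Kind { IN_ORDER, OUT_OF_ORDER, LATE }

    /**
     * @param recordTs   event timestamp of the record
     * @param windowEnd  end of the window the record belongs to
     * @param graceMs    grace period after the window end
     * @param streamTime highest timestamp observed so far
     */
    static Kind classify(long recordTs, long windowEnd, long graceMs, long streamTime) {
        // Assumed rule: the window is closed once stream time reaches end + grace.
        if (windowEnd + graceMs <= streamTime) {
            return Kind.LATE;
        }
        // Behind stream time, but the window can still accept the record.
        if (recordTs < streamTime) {
            return Kind.OUT_OF_ORDER;
        }
        return Kind.IN_ORDER;
    }
}
```

Under these assumptions, a record with timestamp 5 in a window ending at 10 with a grace of 5 is out-of-order at stream time 12, and late at stream time 15.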
Hello!
I'd like to initiate a discussion thread on KIP-1127:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1127+Flexible+Windows+for+Late+Arriving+Data
This KIP aims to make it easier to specify windowing semantics that are
more tolerant of late-arriving data, particularly with suppressi