The timestamp is not only used for windowing specs but also for flow
control (i.e., it is used as a "message chooser" among multiple input
topic partitions); see this section for details:
http://docs.confluent.io/3.0.0/streams/architecture.html#flow-control-with-timestamps
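The flow-control idea can be sketched in plain Java. This is an illustrative stand-in, not Kafka Streams' actual implementation: among the buffered input partitions, the task always picks next the partition whose head record has the smallest timestamp.

```java
import java.util.Deque;
import java.util.Map;

// Illustrative sketch of timestamp-based flow control: given per-partition
// record buffers, choose the partition whose head record has the smallest
// timestamp. All names here are hypothetical, not Kafka Streams API.
class TimestampChooser {
    // Each record is just (timestamp, value) for this sketch.
    record Rec(long timestamp, String value) {}

    static String choosePartition(Map<String, Deque<Rec>> buffers) {
        String best = null;
        long bestTs = Long.MAX_VALUE;
        for (var e : buffers.entrySet()) {
            Rec head = e.getValue().peekFirst();
            if (head != null && head.timestamp() < bestTs) {
                bestTs = head.timestamp();
                best = e.getKey();
            }
        }
        return best; // null if all buffers are empty
    }
}
```

Because the choice is per-record, a partition whose records carry old timestamps is drained before partitions with newer data, which is what makes the "table first" trick discussed later in this thread work.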
Guozhang
On Fri, M
Guozhang,
Timestamp extraction seems more like a stream-level API. I guess it's a
better fit as a global option when using WallclockTimestampExtractor
or ConsumerRecordTimestampExtractor.
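For reference, the global option looks roughly like this in the Kafka 0.10.x / Confluent 3.0 configuration (the `timestamp.extractor` key is, to my understanding, what `StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG` resolves to; application id and broker address below are placeholders):

```java
import java.util.Properties;

// Configuration sketch: the TimestampExtractor is set once, globally, for the
// whole Streams application. Values below are example placeholders.
class ConfigSketch {
    static Properties streamsConfig() {
        Properties props = new Properties();
        props.put("application.id", "denormalize-app");   // example app id
        props.put("bootstrap.servers", "localhost:9092"); // example broker
        // Use the built-in wall-clock extractor for every input topic:
        props.put("timestamp.extractor",
                  "org.apache.kafka.streams.processor.WallclockTimestampExtractor");
        return props;
    }
}
```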
w.r.t your statement -- "I think setting timestamps for this KTable to make
sure its values is smaller than t
A processor is guaranteed to be executed on the same thread at any given
time; its process() and punctuate() will always be triggered to run in a
single thread.
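A practical consequence of this guarantee, sketched in plain Java (not the actual Processor API): per-processor state such as a counter map needs no synchronization, because process() and punctuate() of one processor instance never run concurrently.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: because process() and punctuate() for one processor instance
// are never invoked concurrently, per-processor state like this map needs
// no locking. This mimics the shape of a Processor, not the real interface.
class CountingProcessorSketch {
    private final Map<String, Long> counts = new HashMap<>(); // no locks needed

    void process(String key) {
        counts.merge(key, 1L, Long::sum);
    }

    // punctuate() runs on the same thread, so it sees a consistent map.
    long punctuate(String key) {
        return counts.getOrDefault(key, 0L);
    }
}
```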
Currently TimestampExtractor is set globally, but you can definitely define
different logic depending on the topic name (which is includ
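One globally configured extractor can still branch on the topic name. A minimal sketch of that idea, where `SimpleRecord` is a hypothetical stand-in for the ConsumerRecord that the real extract() receives (topic names below are examples):

```java
// Sketch: one global extractor applying different logic per topic.
// SimpleRecord is a hypothetical stand-in for ConsumerRecord; the topic
// name "dimension-table" is an example, not from the original thread.
class PerTopicExtractor {
    record SimpleRecord(String topic, long embeddedTimestamp) {}

    long extract(SimpleRecord rec) {
        if (rec.topic().equals("dimension-table")) {
            // table topic: use the record's embedded (event) timestamp,
            // as ConsumerRecordTimestampExtractor would
            return rec.embeddedTimestamp();
        }
        // all other topics: use wall-clock time,
        // as WallclockTimestampExtractor would
        return System.currentTimeMillis();
    }
}
```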
Guozhang,
I guess you are referring to a scenario where noOfThreads < totalNoOfTasks.
We could have a KTable task and a KStream task running on the same thread,
and sleep would be counterproductive?
On this note, will a Processor always run on the same thread? Are process()
and punctuate() guaranteed t
Srikanth,
Note that the same thread may be used for fetching both the "semi-static"
KTable stream and the continuous KStream stream, so
sleep-on-no-match may not work.
I think setting timestamps for this KTable to make sure its values are
smaller than the KStream stream will work, and there
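This suggestion can be sketched as an extractor that pins the table topic to timestamp 0, so smallest-timestamp-first flow control keeps preferring the table partition until it is drained. As before, `SimpleRecord` and the topic name are hypothetical stand-ins:

```java
// Sketch of the "table first" trick: return a timestamp for the KTable topic
// that is smaller than any real KStream timestamp, so timestamp-based flow
// control drains the table before processing the stream.
// SimpleRecord is a hypothetical stand-in for ConsumerRecord.
class TableFirstExtractor {
    record SimpleRecord(String topic, long embeddedTimestamp) {}

    static final String TABLE_TOPIC = "dimension-table"; // assumed topic name

    long extract(SimpleRecord rec) {
        if (rec.topic().equals(TABLE_TOPIC)) {
            return 0L; // always "older" than any real stream timestamp
        }
        return rec.embeddedTimestamp();
    }
}
```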
Thanks Guozhang & Matthias!
For 1), it is going to be a common ask. So a DSL API will be good.
For 2), the source for the KTable is currently a file. Think of it as a dump
from a DB table.
We are thinking of ways to stream updates from this table. But for now it's
a new file every day or so.
I plan
Hi Srikanth,
as Guozhang mentioned, the problem is the definition of the point in time
at which your table is read for joining with the stream.
Using transform() you would basically read a changelog-stream within your
custom Transformer class and apply it via KStream.transform() to your
regular stream. (i.e., y
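The idea behind such a Transformer can be sketched in plain Java: maintain the table's current content in a map fed from the changelog, and enrich each stream record by key lookup. The real version would implement Kafka Streams' Transformer interface and be wired up via KStream.transform(); all names below are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the custom-Transformer broadcast join: a map holds
// the table's current content (fed from its changelog) and each stream
// record is enriched by key lookup. Types and names are illustrative.
class BroadcastJoinSketch {
    private final Map<String, String> table = new HashMap<>();

    // Called for each record of the table's changelog stream.
    void onTableUpdate(String key, String value) {
        if (value == null) table.remove(key); // tombstone deletes the entry
        else table.put(key, value);
    }

    // Called for each record of the regular stream; returns the joined value,
    // or null when there is no matching table row (inner-join semantics).
    String transform(String key, String value) {
        String dim = table.get(key);
        return dim == null ? null : value + "," + dim;
    }
}
```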
Hi Srikanth,
How do you define if the "KTable is read completely" to get to the "current
content"? Since, as you said, the table is not purely static but still has a
maybe-low-traffic update stream, I guess "catching up to current content" still
depends on some timestamp?
BTW about 1), we are c
Matthias,
For (2), how do you achieve this using transform()?
Thanks,
Srikanth
On Sat, May 21, 2016 at 9:10 AM, Matthias J. Sax
wrote:
Hi Srikanth,
1) there is no support on DSL level, but if you use Processor API you
can do "anything" you like. So yes, a map-like transform() that gets
initialized with the "broadcast-side" of the join should work.
2) Right now, there is no way to stall a stream -- a custom
TimestampExtractor wil
Hello,
I'm writing a workflow using Kafka Streams where an incoming stream needs
to be denormalized and joined with a few dimension tables.
It will be written back to another Kafka topic. Fairly typical, I believe.
1) Can I do a broadcast join if my dimension table is small enough to be held
in each