Hi Jeff,

This is a common case of stream-table join, in that the joining results
depending on the arrival ordering from these two sources.

In Kafka Streams you can try to "synchronize" multiple input streams
through the "TimestampExtractor" interface, which is used to assign a
timestamp to each record polled from Kafka to start the processing. You
can, for example, set the timestamps for your KStream within a later time
interval and the timestamps for your KTable stream with an earlier time
interval, so that the records from table are likely to be processed first.
Note that this is an best effort, in that we cannot guarantee global
ordering across streams while processing, that if you have a much later
record coming from KTable then it will not block earlier records from
KStream from being processed first. But we think this mechanism should be
sufficient in practice.

Let me know if it fits with your scenario, and if not we can talk about how
it can be possibly improved.

Guozhang

On Wed, Mar 30, 2016 at 8:14 AM, Jeff Klukas <jklu...@simple.com> wrote:

> I have a KStream that I want to enrich with some values from a lookup
> table. When a new key enters the KStream, there's likely to be a
> corresponding entry arriving on the KStream at the same time, so we end up
> with a race condition. If the KTable record arrives first, then its value
> is available for joining when the corresponding arrives on the KStream.  If
> the KStream record arrives first, however, we'll get a null join even if
> the KTable gets the corresponding record only milliseconds later.
>
> I'd like to give the KTable a "head start" of ~10 seconds, so that it gets
> a chance to get updated before the corresponding records arrive on the
> KStream. Could I achieve this using one of the existing Windowing
> strategies?
>



-- 
-- Guozhang

Reply via email to