Hi group,

I wanted to get your suggestion on how to implement two requirements we have:

*         One is to read from external message queue (JMS) at very fast latency

*         Second is to support zero data loss, so that in case of restart and 
recovery, messages not checkpointed (and not part of state) will be replayed 
again.

(which indicates kind of replayble source)

Because of the first requirement we can't write JMS messages to Kafka first and 
only then read from kafka, because it will increase latency.
Instead we thought to consume the JMS messages and forward them both to job and 
to KafkaSink.
Then in case of failure and recovery, we want to start in recovery mode, and 
read message from offset matching the state\checkpoint.
How can this be done? We though to somehow save in the state the last flushed 
kakfa offset.
The problem is this information is available only via future\interceptor and we 
don't know how to connect it to state, so RecoverySource can use it...

So current suggestion looks something like:

Happy path:
JMSQueue-> JMSConsumer -> JobMessageParser(and additional operators), KafkaSink
(Here maybe we can add ProducerInterceptor-> which saves offset to state 
somehow)

Failure path: (will run before HappyPath to recover data)
RecoverySource-> JobMessageParser(and additional operators)
(Here maybe add Queryable state client which reads offsets from other operator 
state)

Thanks,
Tovi

Reply via email to