The simplest way would be to have your job send a message to another kafka 
topic when it’s done.

Rick



> On Dec 1, 2015, at 3:44 PM, Anton Polyakov <polyakov.an...@gmail.com> wrote:
> 
> Hi
> 
> I am looking at Samza to process some incoming stream of trades. Processing 
> pipeline is a complex DAG where some nodes might create zero to many 
> descendant events. Ultimately they got to the end sink (and now these are 
> completely different events all originated by one source trade)
> 
> I am looking for a answer to quite fundamental (in my view) question - given 
> source trade, how could I know if it was fully processed by DAG or is it 
> still in flight? It is called lineage tracking in Storm afaik.
> 
> In my situation when trade event is fired I need a reliable way to determine 
> when processing is done in order to treat results. I googled a lot of 
> algorithms - checkpointing in Flink, transactions in Storm. Checkpointing in 
> Flink could work but unfortunately I can’t use them from my source to mark a 
> trade with checkpoint barrier - API is not available. Looking at Samza 
> there’s nothing obvious either.
> 
> I have a very uncomfortable feeling that the problem should be very basic and 
> fundamental, maybe I am using wrong terms to explain it. take it to extreme - 
> one can send 1 event to Samza and processing is takes 15 minutes on some 
> node. How on the end of the pipeline should I know when its done?
> 
> Thank you

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to