Hi Rick

But the processing pipeline is more than one job, and the jobs work in
parallel, not serially. This is why a single "done" flag is not enough.
Imagine a situation where 1 trade creates 3 sub-events and they are
processed concurrently by 3 instances of the job. Then I need to wait for at
least 3 flags. And if the pipeline is more complex, there can be more flags,
all of which I have to know in advance. So I would be tied to some known
topology, and if it changes, the "finishing" condition also changes.
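
To make the "wait for N flags" idea concrete, here is a minimal sketch of what the consumer of the "done" topic would have to do. It is plain Python, not Samza or Kafka API; the class name, the flag names and the trade id are all hypothetical. Note that the set of expected flags is hard-coded, which is exactly the coupling to the topology described above.

```python
class CompletionTracker:
    """Collects "done" flags per trade until every expected flag has arrived.

    Hypothetical helper for illustration only; not part of Samza or Kafka.
    """

    def __init__(self, expected_flags):
        # The full set of flags must be known in advance (fixed topology).
        self.expected_flags = frozenset(expected_flags)
        self.seen = {}  # trade_id -> set of flags received so far

    def on_done(self, trade_id, flag):
        """Record one flag; return True once the trade is fully processed."""
        if flag not in self.expected_flags:
            raise ValueError("unexpected flag: %r" % (flag,))
        received = self.seen.setdefault(trade_id, set())
        received.add(flag)  # a redelivered flag is harmless, sets deduplicate
        if received == self.expected_flags:
            del self.seen[trade_id]  # drop state for the finished trade
            return True
        return False


# One trade fans out into 3 sub-events, each handled by its own job instance.
tracker = CompletionTracker({"sub-event-1", "sub-event-2", "sub-event-3"})
results = [
    tracker.on_done("trade-42", flag)
    for flag in ("sub-event-2", "sub-event-1", "sub-event-3")
]
```

If a node is added to the pipeline, the set passed to the tracker has to change as well, otherwise the trade is reported as finished too early.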

On Wednesday, December 2, 2015, Rick Mangi <r...@chartbeat.com> wrote:

> The simplest way would be to have your job send a message to another kafka
> topic when it’s done.
>
> Rick
>
>
>
> > On Dec 1, 2015, at 3:44 PM, Anton Polyakov <polyakov.an...@gmail.com> wrote:
> >
> > Hi
> >
> > I am looking at Samza to process an incoming stream of trades.
> The processing pipeline is a complex DAG where some nodes might create zero to
> many descendant events. Ultimately they get to the end sink (and by then these
> are completely different events, all originating from one source trade).
> >
> > I am looking for an answer to a quite fundamental (in my view) question:
> given a source trade, how can I know whether it was fully processed by the DAG
> or is still in flight? It is called lineage tracking in Storm, afaik.
> >
> > In my situation, when a trade event is fired I need a reliable way to
> determine when processing is done in order to handle the results. I googled a
> lot of algorithms - checkpointing in Flink, transactions in Storm.
> Checkpointing in Flink could work, but unfortunately I can’t use it from
> my source to mark a trade with a checkpoint barrier - the API is not available.
> Looking at Samza, there’s nothing obvious either.
> >
> > I have a very uncomfortable feeling that the problem should be very
> basic and fundamental; maybe I am using the wrong terms to explain it. Take it
> to the extreme - one can send 1 event to Samza and processing takes 15
> minutes on some node. How, at the end of the pipeline, should I know when it’s
> done?
> >
> > Thank you
>
>
