Are you running a fixed number of iterations, or do you use a dynamic
termination criterion?
For fixed iterations, you can get the id of the current iteration ... which
allows you to find out when you are running the last iteration.
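
A minimal sketch of that check for the fixed-iteration case. `LastIterationCheck` is a hypothetical helper, not a Flink class. I'm assuming the current iteration id is the 1-based superstep number (in the DataSet API a rich function can read it via `getIterationRuntimeContext().getSuperstepNumber()`) and that you pass the same maximum you configured for the bulk iteration. With a dynamic termination criterion the job may stop earlier, so this comparison would not be reliable there.

```java
// Hypothetical helper (not part of Flink): decides whether a superstep is the
// last one when the job runs a fixed, known number of iterations.
final class LastIterationCheck {

    private LastIterationCheck() {}

    // Assumption: superstep numbers are 1-based, so the last iteration is the
    // one whose number equals the configured maximum.
    static boolean isLastIteration(int currentSuperstep, int maxIterations) {
        if (maxIterations < 1) {
            throw new IllegalArgumentException("maxIterations must be >= 1");
        }
        if (currentSuperstep < 1 || currentSuperstep > maxIterations) {
            throw new IllegalArgumentException("superstep out of range: " + currentSuperstep);
        }
        return currentSuperstep == maxIterations;
    }

    // Stand-alone demo loop that mimics the supersteps of a 5-iteration job.
    public static void main(String[] args) {
        int maxIterations = 5;
        for (int superstep = 1; superstep <= maxIterations; superstep++) {
            if (isLastIteration(superstep, maxIterations)) {
                System.out.println("flush collected stats in superstep " + superstep);
            }
        }
    }
}
```

Inside a rich function you would call this from `close()`, which runs at the end of every iteration, and only flush when it returns true.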

Would it be feasible for you to just log these statistics to the log file?
You can retrieve the statistics once the job has finished.
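
Roughly what I have in mind, as a sketch only: write one line per iteration with the three values you mentioned (iteration, time spent, residual), so nothing has to be buffered until the end. `IterationStatsLog` and its key=value line format are made up for illustration, and I'm using `java.util.logging` here only to keep the example dependency-free; in a real job you would use whatever logger your setup already writes to the log file.

```java
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical helper: emits one parseable log line per iteration.
final class IterationStatsLog {

    private static final Logger LOG = Logger.getLogger(IterationStatsLog.class.getName());

    private IterationStatsLog() {}

    // Builds a key=value line; Locale.ROOT keeps the number format stable
    // regardless of the machine's default locale.
    static String formatLine(int iteration, long elapsedMillis, double residual) {
        return String.format(Locale.ROOT,
                "iteration=%d elapsedMs=%d residual=%.6e",
                iteration, elapsedMillis, residual);
    }

    static void logIteration(int iteration, long elapsedMillis, double residual) {
        LOG.info(formatLine(iteration, elapsedMillis, residual));
    }

    // Stand-alone demo: three fake iterations with a shrinking residual.
    public static void main(String[] args) {
        double residual = 1.0;
        for (int iteration = 1; iteration <= 3; iteration++) {
            long start = System.nanoTime();
            residual /= 10.0; // stand-in for the real work of one iteration
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            logIteration(iteration, elapsedMillis, residual);
        }
    }
}
```

After the job finishes, the lines can be pulled out of the log by searching for `iteration=`.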

On Mon, Jun 15, 2015 at 7:32 AM, Nam-Luc Tran <namluc.t...@euranova.eu>
wrote:

> Hi Ufuk,
>
> The kind of things we'd like to log are: time spent in the iteration,
> residual of the algorithm (convergence), current iteration.
>
> Best regards,
>
> Tran Nam-Luc
>
>
> At Monday, 15/06/2015 on 16:15 Ufuk Celebi wrote:
>
> Hey Tran Nam-Luc,
>
> there is currently no way to do this.
>
> The iteration sync task keeps track of iteration convergence/max
> number of iterations and signals termination to the iteration head.
> After this, the head flushes the produced result to the next task
> (after the iteration) and the intermediate iteration tasks finish w/o
> calling close again.
>
> Because there is no "final" no-op iteration happening, the iteration
> tasks don't know when the last iteration happened.
>
> I'm not sure what the best way is to implement this at the moment.
>
> What kind of stats are you recording?
>
> – Ufuk
>
> On 15 Jun 2015, at 15:53, Nam-Luc Tran wrote:
>
> > Hello Everyone,
> >
> > I would like to log certain stats during iterations in a bulk
> > iterative job. The way I do this is store the things I want at each
> > iteration and plan to flush everything to HDFS once all the iterations
> > are done. To do that I would need to know when the last iteration is
> > invoked in order to flush the data. However, the close() method in the
> > RichMapFunction is executed at the end of each iteration.
> >
> > Is there any way to know when I am in the last invocation? Or would you
> > have a better suggestion to achieve what I am trying to do?
> >
> > Thank you and best regards,
> >
> > Tran Nam-Luc
> >
> >
>
>
>
