Hi Nam-Luc! Having per-iteration statistics and accumulators is on the roadmap.
The way I have done this so far is to create accumulators as shown below,
which creates a new accumulator for each superstep:

    import org.apache.flink.api.common.accumulators.LongCounter;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    class MyFunction extends RichMapFunction<Long, Long> {

        private LongCounter counter;

        @Override
        public void open(Configuration cfg) {
            // open() runs once per superstep, so appending the superstep
            // number to the name gives one counter per superstep.
            counter = getRuntimeContext().getLongCounter(
                    "counter" + getIterationRuntimeContext().getSuperstepNumber());
        }

        . . .
    }

On Sun, Jun 21, 2015 at 1:35 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Are you running a fixed number of iterations or do you use a dynamic
> termination criterion?
> For fixed iterations, you can get the id of the current iteration ... which
> allows you to find out when you are running the last iteration.
>
> Would it be feasible for you to just log these statistics to the log file?
> You can retrieve the statistics once the job has finished.
>
> On Mon, Jun 15, 2015 at 7:32 AM, Nam-Luc Tran <namluc.t...@euranova.eu>
> wrote:
>
> > Hi Ufuk,
> >
> > The kind of things we'd like to log are: time spent in the iteration,
> > residual of the algorithm (convergence), current iteration.
> >
> > Best regards,
> >
> > Tran Nam-Luc
> >
> > At Monday, 15/06/2015 on 16:15 Ufuk Celebi wrote:
> >
> > Hey Tran Nam-Luc,
> >
> > there is currently no way to do this.
> >
> > The iteration sync task keeps track of iteration convergence/max
> > number of iterations and signals termination to the iteration head.
> > After this, the head flushes the produced result to the next task
> > (after the iteration) and the intermediate iteration tasks finish w/o
> > calling close again.
> >
> > Because there is no "final" no-op iteration happening, the iteration
> > tasks don't know when the last iteration happened.
> >
> > I'm not sure what the best way is to implement this at the moment.
> >
> > What kind of stats are you recording?
> >
> > – Ufuk
> >
> > On 15 Jun 2015, at 15:53, Nam-Luc Tran wrote:
> >
> > > Hello Everyone,
> > >
> > > I would like to log certain stats during iterations in a bulk
> > > iterative job. The way I do this is to store the things I want at
> > > each iteration and then flush everything to HDFS once all the
> > > iterations are done. To do that I would need to know when the last
> > > iteration is invoked in order to flush the data. However, the close()
> > > method in the RichMapFunction is executed at the end of each
> > > iteration.
> > >
> > > Is there any way to know when I am in the last invocation? Or would
> > > you have a better suggestion to achieve what I am trying to do?
> > >
> > > Thank you and best regards,
> > >
> > > Tran Nam-Luc
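
P.S. To read the per-superstep counters back once the job has finished, something along these lines might work. It is only a sketch under a few assumptions: the map passed in is the one returned by JobExecutionResult.getAllAccumulatorResults(), the counters are named "counter" + superstep number as in the snippet at the top of this mail, and SuperstepCounters and bySuperstep are made-up names for illustration.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper (not part of Flink): groups accumulator results
// named <prefix><superstep> into a map ordered by superstep number.
final class SuperstepCounters {

    private SuperstepCounters() {}

    static SortedMap<Integer, Long> bySuperstep(Map<String, Object> results, String prefix) {
        SortedMap<Integer, Long> perSuperstep = new TreeMap<>();
        for (Map.Entry<String, Object> entry : results.entrySet()) {
            String name = entry.getKey();
            Object value = entry.getValue();
            // Skip accumulators with a different name or a non-Long result.
            if (!name.startsWith(prefix) || !(value instanceof Long)) {
                continue;
            }
            try {
                int superstep = Integer.parseInt(name.substring(prefix.length()));
                perSuperstep.put(superstep, (Long) value);
            } catch (NumberFormatException ignored) {
                // The suffix is not a number, so this is not a per-superstep counter.
            }
        }
        return perSuperstep;
    }
}
```

After env.execute() returns, you would call SuperstepCounters.bySuperstep(result.getAllAccumulatorResults(), "counter") and iterate over the entries in superstep order. The TreeMap is there so that "counter10" sorts after "counter2", which a plain sort on the names would get wrong.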