Hi Nam-Luc!

Having per-iteration statistics and accumulators is on the roadmap.

So far I have done this by creating accumulators as shown below, with a
new accumulator for each superstep:


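import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

class MyFunction extends RichMapFunction<Long, Long> {

    private LongCounter counter;

    @Override
    public void open(Configuration cfg) {
        // The superstep number becomes part of the accumulator name,
        // so every superstep gets its own counter.
        counter = getRuntimeContext().getLongCounter(
                "counter" + getIterationRuntimeContext().getSuperstepNumber());
    }

    . . .
}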
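
Once the job has finished, the per-superstep counters need to be picked
out of the accumulator results again. The following is only a minimal
sketch of how that could look, assuming the results arrive as a
Map<String, Object> (for example from
JobExecutionResult.getAllAccumulatorResults()) and that the counters use
the "counter" + superstep naming from above. SuperstepCounters and
bySuperstep are hypothetical names for illustration, not part of Flink.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper (not part of Flink): groups per-superstep counters
// by their superstep number.
final class SuperstepCounters {

    private SuperstepCounters() {}

    /**
     * Collects every entry named prefix + <superstep number> whose value is
     * a Long, and returns the values ordered by superstep number.
     * Entries that do not match this pattern are ignored.
     */
    static SortedMap<Integer, Long> bySuperstep(
            Map<String, Object> accumulatorResults, String prefix) {
        SortedMap<Integer, Long> perSuperstep = new TreeMap<>();
        for (Map.Entry<String, Object> entry : accumulatorResults.entrySet()) {
            String name = entry.getKey();
            if (!name.startsWith(prefix) || !(entry.getValue() instanceof Long)) {
                continue;
            }
            String suffix = name.substring(prefix.length());
            // Accept only a plain run of digits as the superstep number.
            if (suffix.isEmpty() || !suffix.chars().allMatch(Character::isDigit)) {
                continue;
            }
            try {
                perSuperstep.put(Integer.parseInt(suffix), (Long) entry.getValue());
            } catch (NumberFormatException e) {
                // Suffix too large for an int: skip this entry.
            }
        }
        return perSuperstep;
    }
}
```

Calling SuperstepCounters.bySuperstep(results, "counter") then gives one
value per superstep in numeric order, so "counter10" sorts after
"counter2" rather than before it.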




On Sun, Jun 21, 2015 at 1:35 AM, Robert Metzger <rmetz...@apache.org> wrote:

> Are you running a fixed number of iterations or do you use a dynamic
> termination criterion?
> For fixed iterations, you can get the id of the current iteration ... which
> allows you to find out when you are running the last iteration.
>
> Would it be feasible for you to just log these statistics to the log file?
> You can retrieve the statistics once the job has finished.
>
> On Mon, Jun 15, 2015 at 7:32 AM, Nam-Luc Tran <namluc.t...@euranova.eu>
> wrote:
>
> > Hi Ufuk,
> >
> > The kind of things we'd like to log are: time spent in the iteration,
> > residual of the algorithm (convergence), current iteration.
> >
> > Best regards,
> >
> > Tran Nam-Luc
> >
> >
> > At Monday, 15/06/2015 on 16:15 Ufuk Celebi wrote:
> >
> > Hey Tran Nam-Luc,
> >
> > there is currently no way to do this.
> >
> > The iteration sync task keeps track of iteration convergence/max
> > number of iterations and signals termination to the iteration head.
> > After this, the head flushes the produced result to the next task
> > (after the iteration) and the intermediate iteration tasks finish w/o
> > calling close again.
> >
> > Because there is no "final" no-op iteration happening, the iteration
> > tasks don't know when the last iteration happened.
> >
> > I'm not sure what the best way is to implement this at the moment.
> >
> > What kind of stats are you recording?
> >
> > – Ufuk
> >
> > On 15 Jun 2015, at 15:53, Nam-Luc Tran wrote:
> >
> > > Hello Everyone,
> > >
> > > I would like to log certain stats during iterations in a bulk
> > > iterative job. The way I do this is to store the things I want at
> > > each iteration and flush everything to HDFS once all the iterations
> > > are done. To do that, I need to know when the last iteration is
> > > invoked so that I can flush the data. However, the close() method in
> > > the RichMapFunction is executed at the end of each iteration.
> > >
> > > Is there any way to know when I am in the last invocation? Or would
> > > you have a better suggestion to achieve what I am trying to do?
> > >
> > > Thank you and best regards,
> > >
> > > Tran Nam-Luc
> > >
> > >
> >
> >
> >
>
