On Fri, Nov 19, 2010 at 10:10 AM, Gilles Sadowski <gil...@harfang.homelinux.org> wrote:
> > But seriously, the long running programs in Mahout are almost all
> > map-reduce jobs and there is a fairly good framework for
> > progress reporting in Hadoop. This includes normal logging as well as a
> > counter framework that allows code to drive status
> > counters in parallel out to a standard web interface showing the status
> > of a job.
>
> That's nice! We'll make CM depend on Hadoop and mandate that every
> algorithm implements a servlet interface. ;-)

Well, you don't need the servlet part. And seriously, having a
counter-style listener interface wouldn't be such a bad thing. It doesn't
require much framework to stub out.

> > The only non-Hadoop long-running job is the stochastic gradient descent
> > modeling stuff. There, we use simple logging [...]
>
> Simple logging is readily achievable in CM. We just depend on "slf4j-api"
> and we are done.

Yes.

> > [...] with a list of tab separated values on a specially marked log
> > line. These can be extracted using tail and grep to provide progress
> > plots.
>
> How is this different from what I proposed in my last post (apart from the
> fact that a compatible logger implementation, as hinted at, would be much
> more powerful than "grep"ping)?

It isn't much different. I just was responding to a direct question about
what Mahout does.
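To illustrate the "counter-style listener interface" idea: here is a minimal sketch of what stubbing it out might look like, with no Hadoop dependency. All names here (CounterListener, ProgressCounter, CounterDemo) are invented for illustration and are not part of any existing Mahout or Commons Math API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical callback fired whenever a named counter changes.
interface CounterListener {
    void counterUpdated(String name, long value);
}

// Hypothetical counter a long-running algorithm could increment;
// listeners decide what to do with updates (log, push to a UI, ...).
class ProgressCounter {
    private final String name;
    private final AtomicLong value = new AtomicLong();
    private final List<CounterListener> listeners = new CopyOnWriteArrayList<>();

    ProgressCounter(String name) { this.name = name; }

    void addListener(CounterListener l) { listeners.add(l); }

    long increment(long delta) {
        long v = value.addAndGet(delta);
        for (CounterListener l : listeners) {
            l.counterUpdated(name, v);
        }
        return v;
    }

    long get() { return value.get(); }
}

public class CounterDemo {
    public static void main(String[] args) {
        ProgressCounter rows = new ProgressCounter("rows.processed");
        // A listener that just prints name and value; a servlet or logger
        // could be plugged in here instead.
        rows.addListener((name, v) -> System.out.println(name + "\t" + v));
        for (int i = 0; i < 3; i++) {
            rows.increment(1);
        }
    }
}
```

The algorithm only ever talks to `ProgressCounter`; whether updates end up in a log file or a web status page is entirely the listener's business, which is the decoupling being argued for above.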
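For the tab-separated-values-on-a-marked-log-line scheme described above, extraction really is just grep and cut. A sketch, assuming a hypothetical marker token `TSV` and a made-up log file name; the exact marker and field layout in Mahout may differ:

```shell
# Fake a log file: two progress records tagged "TSV" carrying
# tab-separated fields (iteration, log-likelihood), mixed with
# ordinary log lines. Format and file name are illustrative only.
printf 'INFO starting pass 1\nINFO TSV\t1\t-1042.7\nINFO TSV\t2\t-873.2\nINFO done\n' > train.log

# Keep only the progress records and strip the prefix; the numeric
# columns can be piped straight into gnuplot or R for a progress plot.
grep 'TSV' train.log | cut -f2,3
```

On a live job, `tail -f train.log | grep 'TSV'` gives the same stream incrementally, which is the "tail and grep" workflow mentioned above.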