Hi all,

First off, I want to say that I love Spark and am very excited about MLBase. I'd love to contribute now that I have some time, but before I do that I'd like to familiarize myself with the process.
While looking at a few projects (I've settled on one, which I'll discuss in another thread), I found some very minor optimizations I could contribute as part of this first step. Before I open a PR: I've checked style, run the tests, etc., per the instructions, but I'd still like someone to glance over the change quickly and confirm that it is JIRA-worthy.

Commit: https://github.com/izendejas/spark/commit/81065aed9987c1b08cd5784b7a6153e26f3f7402

To summarize:

* I got rid of some SeqLike.reverse calls when sorting in descending order
* replaced slice(1, length) calls with the much safer (avoids IOOBEs) and more readable .tail calls
* used a foldLeft to avoid mutable variables in the NaiveBayes code

The last one is meant to find out which is valued more here: idiomatic Scala or readability. I'm personally a fan of foldLeft where applicable, but I do think it's a bit less readable.

Thanks,
Ignacio
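
P.S. For anyone who would rather not click through, here is a minimal sketch of the three patterns listed above. It is not the code from the commit: the object name, the sample values, and the helper functions are all made up for illustration, and the inputs are deliberately non-empty since .tail throws on an empty collection.

```scala
object IdiomSketch {
  // Hypothetical sample data, for illustration only.
  val scores: Seq[Int] = Seq(3, 1, 5)
  val fields: Array[String] = Array("label", "0.5", "1.5")

  // 1. Descending sort: sort once with a reversed Ordering instead of
  //    sorting ascending and then calling reverse. (With duplicate sort
  //    keys the two can order tied elements differently.)
  val viaReverse: Seq[Int] = scores.sorted.reverse
  val viaOrdering: Seq[Int] = scores.sorted(Ordering[Int].reverse)

  // 2. Dropping the first element of a non-empty collection.
  val viaSlice: Array[String] = fields.slice(1, fields.length)
  val viaTail: Array[String] = fields.tail

  // 3. Accumulating with a mutable variable versus foldLeft.
  def sumMutable(xs: Seq[Double]): Double = {
    var total = 0.0
    for (x <- xs) total += x
    total
  }

  def sumFold(xs: Seq[Double]): Double =
    xs.foldLeft(0.0)((acc, x) => acc + x)

  def main(args: Array[String]): Unit = {
    println(viaOrdering.mkString(", "))
    println(viaTail.mkString(", "))
    println(sumFold(Seq(0.5, 1.5)))
  }
}
```

In each pair, both forms give the same result on this sample input; the question in the thread is only which form the project prefers.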