Hi David,

What happens if you provide the class labels via metadata instead of letting OneVsRest determine the labels?
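To make the suggestion concrete, here is a rough sketch of one way to attach the number of classes to the label column as ML attribute metadata before fitting. The object and helper names (`LabelMetadata`, `labelMetadata`, `withLabelMetadata`) and the column name `"label"` are made up for illustration, and I have not checked this against your test project, so treat it as a starting point rather than a confirmed fix.

```scala
import org.apache.spark.ml.attribute.NominalAttribute
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.Metadata

// Hypothetical helper object; the names here are illustrative only.
object LabelMetadata {

  // Build nominal-attribute metadata declaring `numClasses` label values.
  def labelMetadata(labelCol: String, numClasses: Int): Metadata =
    NominalAttribute.defaultAttr
      .withName(labelCol)
      .withNumValues(numClasses)
      .toMetadata()

  // Re-select every column, re-aliasing the label column so that it
  // carries the metadata; all other columns pass through unchanged.
  def withLabelMetadata(df: DataFrame, labelCol: String, numClasses: Int): DataFrame = {
    val meta = labelMetadata(labelCol, numClasses)
    val cols = df.columns.map { c =>
      if (c == labelCol) col(c).as(c, meta) else col(c)
    }
    df.select(cols: _*)
  }
}
```

You would then pass something like `LabelMetadata.withLabelMetadata(data, "label", numClasses)` to the cross-validation instead of the raw DataFrame, where `numClasses` is the class count over the whole dataset rather than over a single fold. Whether that avoids the exception is exactly what I'd like to find out.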

Ram

On Mon, Jan 25, 2016 at 3:06 PM, David Brooks <da...@whisk.co.uk> wrote:

> Hi,
>
> I've run into an exception using MLlib OneVsRest with logistic regression
> (v1.6.0, but also in previous versions).
>
> The issue is intermittent. When running multiclass classification with
> K-fold cross validation, there are scenarios where the split does not
> contain instances for every target label. In such cases, an
> ArrayIndexOutOfBoundsException is generated.
>
> I've tried to reproduce the problem in a simple SBT project here:
>
> https://github.com/junglebarry/SparkOneVsRestTest
>
> I don't imagine this is typical - it first surfaced when running over a
> dataset with some very rare classes.
>
> I'm happy to look into patching the code, but I first wanted to confirm
> that the problem was real, and that I wasn't somehow misunderstanding how I
> should be using OneVsRest.
>
> Any guidance would be appreciated - I'm new to the list.
>
> Many thanks,
> David

--
Ram Sriharsha
Architect, Spark and Data Science
Hortonworks, 2550 Great America Way, 2nd Floor
Santa Clara, CA 95054
Ph: 408-510-8635
email: har...@apache.org
<https://www.linkedin.com/in/harsha340>
<https://twitter.com/halfabrane>
<https://github.com/harsha2010/>