Hi David,

What happens if you provide the class labels via metadata instead of
letting OneVsRest determine the labels?
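
For reference, a minimal sketch of attaching the class count to the label column as ML attribute metadata. It assumes a DataFrame with a numeric label column and a class count known up front; the helper name `withLabelMetadata` and its parameters are hypothetical, not part of Spark. Whether this avoids the exception is exactly the open question.

```scala
import org.apache.spark.ml.attribute.NominalAttribute
import org.apache.spark.sql.DataFrame

// Hypothetical helper: tags the label column with the total number of
// classes, so the count is declared in the schema rather than inferred
// from whichever labels happen to appear in a given split.
def withLabelMetadata(df: DataFrame, labelCol: String, numClasses: Int): DataFrame = {
  val meta = NominalAttribute.defaultAttr
    .withName(labelCol)
    .withNumValues(numClasses)
    .toMetadata()
  // Re-alias the column under the same name, carrying the metadata along.
  df.withColumn(labelCol, df(labelCol).as(labelCol, meta))
}
```

The tagged DataFrame would then be passed to the cross-validation fit in place of the original one, e.g. `withLabelMetadata(data, "label", 5)` for a five-class problem.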

Ram

On Mon, Jan 25, 2016 at 3:06 PM, David Brooks <da...@whisk.co.uk> wrote:

> Hi,
>
> I've run into an exception using MLlib OneVsRest with logistic regression
> (v1.6.0, but also in previous versions).
>
> The issue is intermittent.  When running multiclass classification with
> K-fold cross validation, there are scenarios where the split does not
> contain instances for every target label.  In such cases, an
> ArrayIndexOutOfBoundsException is generated.
>
> I've tried to reproduce the problem in a simple SBT project here:
>
>    https://github.com/junglebarry/SparkOneVsRestTest
>
> I don't imagine this is typical - it first surfaced when running over a
> dataset with some very rare classes.
>
> I'm happy to look into patching the code, but I first wanted to confirm
> that the problem was real, and that I wasn't somehow misunderstanding how I
> should be using OneVsRest.
>
> Any guidance would be appreciated - I'm new to the list.
>
> Many thanks,
> David
>



-- 
Ram Sriharsha
Architect, Spark and Data Science
Hortonworks, 2550 Great America Way, 2nd Floor
Santa Clara, CA 95054
Ph: 408-510-8635
email: har...@apache.org

