Hi Manoj, Done.
https://issues.apache.org/jira/browse/SPARK-9277

On Thu, Jul 23, 2015 at 1:02 PM, Manoj Kumar <manojkumarsivaraj...@gmail.com> wrote:
> Hi,
>
> I think this should raise an error in both the Scala code and the Python API.
>
> Please open a JIRA.
>
> On Thu, Jul 23, 2015 at 4:22 PM, Andrew Vykhodtsev <yoz...@gmail.com> wrote:
>> Dear Developers,
>>
>> I found that one can create a SparseVector with inconsistent arguments,
>> which leads to a Java error at runtime, for example when training
>> LogisticRegressionWithSGD.
>>
>> Here is the test case:
>>
>> In [2]: sc.version
>> Out[2]: u'1.3.1'
>>
>> In [13]: from pyspark.mllib.linalg import SparseVector
>>          from pyspark.mllib.regression import LabeledPoint
>>          from pyspark.mllib.classification import LogisticRegressionWithSGD
>>
>> In [3]: x = SparseVector(2, {1:1, 2:2, 3:3, 4:4, 5:5})
>>
>> In [10]: l = LabeledPoint(0, x)
>>
>> In [12]: r = sc.parallelize([l])
>>
>> In [14]: m = LogisticRegressionWithSGD.train(r)
>>
>> Error:
>>
>> Py4JJavaError: An error occurred while calling
>> o86.trainLogisticRegressionModelWithSGD.
>> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 7
>> in stage 11.0 failed 1 times, most recent failure: Lost task 7.0 in stage
>> 11.0 (TID 47, localhost): java.lang.ArrayIndexOutOfBoundsException: 2
>>
>> Attached is the notebook with the scenario and the full message.
>>
>> Should I raise a JIRA for this? (Forgive me if there is already such a
>> JIRA and I did not notice it.)
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>
> --
> Godspeed,
> Manoj Kumar,
> http://manojbits.wordpress.com
> http://github.com/MechCoder
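
For anyone following along: the bug is that SparseVector(2, {1:1, 2:2, 3:3, 4:4, 5:5}) declares a vector of size 2 but supplies indices up to 5, and nothing rejects it at construction time; the out-of-bounds index only surfaces later as an ArrayIndexOutOfBoundsException inside the JVM. Below is a minimal sketch, in plain Python, of the kind of eager bounds check the JIRA asks for. The function name validate_sparse_vector is hypothetical and is not Spark's actual implementation; it only illustrates the check.

```python
def validate_sparse_vector(size, indices):
    """Hypothetical sketch (not Spark's real code) of the validation
    SPARK-9277 requests: every index must lie in [0, size) and indices
    must be strictly increasing. Raises ValueError on the first violation."""
    prev = -1
    for i in indices:
        if i <= prev:
            raise ValueError(
                "indices must be strictly increasing, got %d after %d" % (i, prev))
        if i >= size:
            raise ValueError(
                "index %d is out of bounds for vector of size %d" % (i, size))
        prev = i

# The failing example from the report: declared size 2, indices up to 5.
try:
    validate_sparse_vector(2, sorted({1: 1, 2: 2, 3: 3, 4: 4, 5: 5}.keys()))
except ValueError as e:
    print(e)  # index 2 is out of bounds for vector of size 2

# With a consistent size, the same indices pass.
validate_sparse_vector(6, [1, 2, 3, 4, 5])
```

Failing fast in the constructor like this would turn the opaque JVM stack trace into a clear Python-side error at the point where the inconsistent vector is created.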