Hi Mikhail,

I followed the MLP user guide and used the dataset and network configuration you mentioned. The MLP trained without any issues using the default parameters, that is, a block size of 128 and 100 iterations.
Source code:

scala> import org.apache.spark.ml.classification.MultilayerPerceptronClassifier
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
scala> val data = sqlContext.read.format("libsvm").load("/data/aloi.scale")
scala> val trainer = new MultilayerPerceptronClassifier().setLayers(Array(128, 128, 1000))
scala> val model = trainer.fit(data)
(after a while)
model: org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel = mlpc_fb3bd70d2ef2

It seems that filing an issue is premature. Could you share your code instead?

Best regards,
Alexander

Just in case, here is the link to the user guide:
https://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier

From: Yanbo Liang [mailto:yblia...@gmail.com]
Sent: Monday, July 04, 2016 9:58 PM
To: mshiryae <mikhail.shiry...@intel.com>
Cc: user@spark.apache.org
Subject: Re: Spark MLlib: MultilayerPerceptronClassifier error?

Would you mind filing a JIRA to track this issue? I will take a look when I have time.

2016-07-04 14:09 GMT-07:00 mshiryae <mikhail.shiry...@intel.com>:

Hi,

I am trying to train a model with MultilayerPerceptronClassifier. It works on the sample data from data/mllib/sample_multiclass_classification_data.txt with 4 features, 3 classes and layers [4, 4, 3]. But when I use other input files with different numbers of features and classes (for example, from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html), I get errors.

Example: input file aloi (128 features, 1000 classes, layers [128, 128, 1000]):

with block size = 1:
ERROR StrongWolfeLineSearch: Encountered bad values in function evaluation. Decreasing step size to Infinity
ERROR LBFGS: Failure! Resetting history: breeze.optimize.FirstOrderException: Line search failed
ERROR LBFGS: Failure again! Giving up and returning. Maybe the objective is just poorly behaved?

with default block size = 128:
java.lang.ArrayIndexOutOfBoundsException
  at java.lang.System.arraycopy(Native Method)
  at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:629)
  at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(Layer.scala:628)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:628)
  at org.apache.spark.ml.ann.DataStacker$$anonfun$3$$anonfun$apply$3.apply(Layer.scala:624)

Even if I modify the sample_multiclass_classification_data.txt file (renaming feature index 4 to 5 on every line) and run with layers [5, 5, 3], I get the same errors as for the file above.

To summarize: I can't run training with the default block size and more than 4 features. If I set the block size to 1, training starts, but I get errors from LBFGS.

This is reproducible with Spark 1.5.2 and with the master branch on GitHub (as of July 4th).

Has anybody run into this behavior? Is there a bug in MultilayerPerceptronClassifier, or am I using it incorrectly?

Thanks.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-MLlib-MultilayerPerceptronClassifier-error-tp27279.html
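
Since the thread turns on whether the layer sizes match the data (128 features and 1000 classes for layers [128, 128, 1000]), one thing that may be worth checking before training is what the LIBSVM file actually contains. The following is a minimal sketch in plain Scala, not part of Spark and not a confirmed diagnosis of the errors above. `LibSvmStats`, `LibSvmCheck`, `summarize` and `layersFit` are hypothetical names introduced here, and the check assumes 1-based feature indices, a non-empty input, and class labels that are expected to lie in the range 0 until the size of the output layer.

```scala
// Hypothetical helper, not a Spark API: summarizes a LIBSVM-formatted data set.
final case class LibSvmStats(
    maxFeatureIndex: Int, // largest feature index seen (1-based in LIBSVM files)
    minLabel: Double,
    maxLabel: Double,
    distinctLabels: Int)

object LibSvmCheck {
  // Each line is expected to look like: "<label> <index>:<value> <index>:<value> ..."
  // Assumes at least one non-empty line; min/max would fail on empty input.
  def summarize(lines: Iterator[String]): LibSvmStats = {
    var maxIndex = 0
    val labels = scala.collection.mutable.Set.empty[Double]
    lines.map(_.trim).filter(_.nonEmpty).foreach { line =>
      val tokens = line.split("\\s+")
      labels += tokens.head.toDouble
      tokens.tail.foreach { token =>
        val index = token.takeWhile(_ != ':').toInt
        if (index > maxIndex) maxIndex = index
      }
    }
    LibSvmStats(maxIndex, labels.min, labels.max, labels.size)
  }

  // True when the input layer equals the largest feature index and every label
  // fits the output layer (assumed label range: 0 until layers.last).
  def layersFit(stats: LibSvmStats, layers: Array[Int]): Boolean =
    layers.head == stats.maxFeatureIndex &&
      stats.minLabel >= 0 && stats.maxLabel < layers.last
}

// Usage on two made-up lines in LIBSVM format:
val sampleStats = LibSvmCheck.summarize(Iterator("0 1:0.5 4:-1.0", "2 2:0.25 3:0.75"))
val sampleFits = LibSvmCheck.layersFit(sampleStats, Array(4, 4, 3))
```

For a real file, the same check could be fed from `scala.io.Source.fromFile("/data/aloi.scale").getLines()` and compared against `Array(128, 128, 1000)`. If `layersFit` returns false, the reported statistics show whether the feature count or the label range is the part that disagrees with the layers.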