Hi Rachana,

I got the same exception. It is because computing the feature importances depends on impurity stats, which are not calculated with the old RandomForestModel in MLlib. Feel free to create a JIRA for this if you think it is necessary; otherwise, I believe this problem will eventually be solved as part of this JIRA: https://issues.apache.org/jira/browse/SPARK-12183
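In the meantime, the usual workaround is to train through the new spark.ml API directly instead of converting an old mllib model with fromOld, so the trees carry impurity stats and featureImportances is populated. A minimal sketch (class name, hyperparameter values, and data path are placeholders; assumes a LibSVM-formatted input as in the Spark examples):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.ml.classification.RandomForestClassificationModel;
import org.apache.spark.ml.classification.RandomForestClassifier;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class FeatureImportanceSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("FeatureImportanceSketch");
    JavaSparkContext jsc = new JavaSparkContext(conf);
    SQLContext sqlContext = new SQLContext(jsc);

    // Assumes a LibSVM-formatted file yielding "label"/"features" columns;
    // the path is a placeholder.
    DataFrame training = sqlContext.read().format("libsvm")
        .load("data/mllib/sample_libsvm_data.txt");

    RandomForestClassifier rfc = new RandomForestClassifier()
        .setNumTrees(20)
        .setMaxDepth(5)
        .setImpurity("gini");

    // Training through spark.ml computes impurity stats at each tree node,
    // so featureImportances is populated rather than throwing an NPE.
    RandomForestClassificationModel model = rfc.fit(training);
    System.out.println(model.featureImportances());

    jsc.stop();
  }
}
```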
Bryan

On Thu, Jan 14, 2016 at 8:12 AM, Rachana Srivastava <rachana.srivast...@markmonitor.com> wrote:
> Tried using the 1.6 version of Spark that takes numberOfFeatures as a fifth
> argument in the API, but still getting featureImportances as null.
>
> RandomForestClassifier rfc = getRandomForestClassifier(numTrees,
>     maxBinSize, maxTreeDepth, seed, impurity);
> RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(
>     model, rfc, categoricalFeatures, numberOfClasses, numberOfFeatures);
> System.out.println(rfm.featureImportances());
>
> Stack Trace:
>
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.spark.ml.tree.impl.RandomForest$.computeFeatureImportance(RandomForest.scala:1152)
>     at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1111)
>     at org.apache.spark.ml.tree.impl.RandomForest$$anonfun$featureImportances$1.apply(RandomForest.scala:1108)
>     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>     at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
>     at org.apache.spark.ml.tree.impl.RandomForest$.featureImportances(RandomForest.scala:1108)
>     at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances$lzycompute(RandomForestClassifier.scala:237)
>     at org.apache.spark.ml.classification.RandomForestClassificationModel.featureImportances(RandomForestClassifier.scala:237)
>     at com.markmonitor.antifraud.ce.ml.CheckFeatureImportance.main(CheckFeatureImportance.java:49)
>
> From: Rachana Srivastava
> Sent: Wednesday, January 13, 2016 3:30 PM
> To: 'u...@spark.apache.org'; 'dev@spark.apache.org'
> Subject: Random Forest FeatureImportance throwing NullPointerException
>
> I have a Random Forest model for which I am trying to get the
> featureImportances vector.
> Map<Object, Object> categoricalFeaturesParam = new HashMap<>();
> scala.collection.immutable.Map<Object, Object> categoricalFeatures =
>     (scala.collection.immutable.Map<Object, Object>)
>     scala.collection.immutable.Map$.MODULE$.apply(
>         JavaConversions.mapAsScalaMap(categoricalFeaturesParam).toSeq());
> int numberOfClasses = 2;
> RandomForestClassifier rfc = new RandomForestClassifier();
> RandomForestClassificationModel rfm = RandomForestClassificationModel.fromOld(
>     model, rfc, categoricalFeatures, numberOfClasses);
> System.out.println(rfm.featureImportances());
>
> When I run the above code, I find featureImportances is null. Do I need to
> set anything specific to get the feature importances for the random forest
> model?
>
> Thanks,
>
> Rachana