I have run into that issue too, but only when the data were not pre-processed correctly, e.g., when a binary categorical feature takes values in {-1, +1} instead of {0, 1}. I'd be very interested to learn if it can occur elsewhere!
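If that turns out to be the cause here, remapping the values before training fixed it for me. A minimal sketch, assuming `data` is an RDD[LabeledPoint] and that feature 0 is the offending binary column (adjust the index for your schema); the remapped values then line up with a 2-arity feature declared in categoricalFeaturesInfo:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    // Remap a {-1, +1} binary feature to {0, 1} so its values match
    // what DecisionTree expects for a 2-category feature.
    val remapped = data.map { lp =>
      val values = lp.features.toArray.clone()   // copy before mutating
      values(0) = if (values(0) < 0) 0.0 else 1.0  // feature 0 assumed binary
      LabeledPoint(lp.label, Vectors.dense(values))
    }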
On Thu, Aug 14, 2014 at 10:16 AM, Sameer Tilak <ssti...@live.com> wrote:
>
> Hi Yanbo,
> I think it was happening because some of the rows did not have all the
> columns. We are cleaning up the data and will let you know once we
> confirm this.
>
> ------------------------------
> Date: Thu, 14 Aug 2014 22:50:58 +0800
> Subject: Re: java.lang.UnknownError: no bin was found for continuous
> variable.
> From: yanboha...@gmail.com
> To: ssti...@live.com
>
> Can you supply the detailed code and data you used?
> From the log, it looks like it cannot find the bin for a specific
> feature. A bin for a continuous feature is a unit that covers a
> specific range of that feature.
>
>
> 2014-08-14 7:43 GMT+08:00 Sameer Tilak <ssti...@live.com>:
>
> Hi All,
>
> I am using the decision tree algorithm and I get the following error.
> Any help would be great!
>
>
> java.lang.UnknownError: no bin was found for continuous variable.
>     at org.apache.spark.mllib.tree.DecisionTree$.findBin$1(DecisionTree.scala:492)
>     at org.apache.spark.mllib.tree.DecisionTree$.org$apache$spark$mllib$tree$DecisionTree$$findBinsForLevel$1(DecisionTree.scala:529)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
>     at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.aggregate(TraversableOnce.scala:201)
>     at scala.collection.AbstractIterator.aggregate(Iterator.scala:1157)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>     at org.apache.spark.scheduler.Task.run(Task.scala:51)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/08/13 16:36:06 ERROR ExecutorUncaughtExceptionHandler: Uncaught
> exception in thread Thread[Executor task launch worker-0,5,main]
> java.lang.UnknownError: no bin was found for continuous variable.
>     [same stack trace repeats]
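For the missing-columns case Sameer mentions above, dropping malformed rows before building LabeledPoints should avoid the error as well. A rough sketch, assuming comma-separated text input with the label in the first column; `numFeatures` and the path are placeholders for your actual data (same imports as above):

    // Keep only rows that have the expected number of fields,
    // then parse the survivors into LabeledPoints.
    val numFeatures = 10  // placeholder: your actual feature count
    val points = sc.textFile("hdfs:///path/to/data.csv")  // placeholder path
      .map(_.split(","))
      .filter(_.length == numFeatures + 1)  // label + features; drop short rows
      .map { cols =>
        LabeledPoint(cols(0).toDouble, Vectors.dense(cols.tail.map(_.toDouble)))
      }

Counting how many rows the filter discards (before vs. after) is also a quick way to confirm whether incomplete rows are really the culprit.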