I have run into that issue too, but only when the data were not
pre-processed correctly, e.g., when a binary categorical feature takes
values in {-1, +1} instead of {0, 1}.  I would be very interested to learn
whether it can occur elsewhere!
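
In case it helps, here is a minimal sketch of the kind of sanity check that
catches both problems mentioned in this thread (rows with missing columns and
categorical values outside the expected range) before training.  It is plain
Python with no Spark calls; `find_bad_rows`, `remap_binary`, and the sample
rows are hypothetical names and data made up for illustration, and it assumes
a categorical feature with k categories should be coded as 0, ..., k-1.

```python
from typing import Dict, List, Sequence


def find_bad_rows(rows: Sequence[Sequence[float]], num_features: int,
                  categorical_arity: Dict[int, int]) -> List[int]:
    """Return indices of rows that are ragged or hold out-of-range categories.

    categorical_arity maps a feature index to its number of categories
    (an assumed layout chosen for this sketch).
    """
    bad = []
    for i, row in enumerate(rows):
        if len(row) != num_features:
            bad.append(i)  # row has missing or extra columns
            continue
        for col, arity in categorical_arity.items():
            value = row[col]
            # Categorical values are assumed to be whole numbers in 0..arity-1.
            if not float(value).is_integer() or not 0 <= value < arity:
                bad.append(i)
                break
    return bad


def remap_binary(value: float) -> float:
    """Map a {-1, +1} encoded binary category onto {0, 1}."""
    return (value + 1.0) / 2.0


# Made-up sample: feature 0 is continuous, feature 1 is binary categorical.
rows = [
    [0.5, 1.0],    # fine
    [1.2, -1.0],   # category encoded as -1 instead of 0
    [3.4],         # missing a column
]
print(find_bad_rows(rows, num_features=2, categorical_arity={1: 2}))  # [1, 2]
```

Rows flagged for a -1 category can be fixed with `remap_binary`; rows with
missing columns need to be dropped or repaired at the source.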


On Thu, Aug 14, 2014 at 10:16 AM, Sameer Tilak <ssti...@live.com> wrote:

>
> Hi Yanbo,
> I think it was happening because some of the rows did not have all the
> columns. We are cleaning up the data and will let you know once we confirm
> this.
>
> ------------------------------
> Date: Thu, 14 Aug 2014 22:50:58 +0800
> Subject: Re: java.lang.UnknownError: no bin was found for continuous variable.
> From: yanboha...@gmail.com
> To: ssti...@live.com
>
> Can you share the detailed code and data you used?
> From the log, it looks like no bin could be found for a specific feature.
> A bin for a continuous feature is a unit that covers a specific range of
> that feature's values.
>
>
> 2014-08-14 7:43 GMT+08:00 Sameer Tilak <ssti...@live.com>:
>
> Hi All,
>
> I am using the decision tree algorithm and I get the following error. Any
> help would be great!
>
>
> java.lang.UnknownError: no bin was found for continuous variable.
>     at org.apache.spark.mllib.tree.DecisionTree$.findBin$1(DecisionTree.scala:492)
>     at org.apache.spark.mllib.tree.DecisionTree$.org$apache$spark$mllib$tree$DecisionTree$$findBinsForLevel$1(DecisionTree.scala:529)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
>     at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.aggregate(TraversableOnce.scala:201)
>     at scala.collection.AbstractIterator.aggregate(Iterator.scala:1157)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>     at org.apache.spark.scheduler.Task.run(Task.scala:51)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> 14/08/13 16:36:06 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
> java.lang.UnknownError: no bin was found for continuous variable.
>     at org.apache.spark.mllib.tree.DecisionTree$.findBin$1(DecisionTree.scala:492)
>     at org.apache.spark.mllib.tree.DecisionTree$.org$apache$spark$mllib$tree$DecisionTree$$findBinsForLevel$1(DecisionTree.scala:529)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at org.apache.spark.mllib.tree.DecisionTree$$anonfun$3.apply(DecisionTree.scala:653)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
>     at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1157)
>     at scala.collection.TraversableOnce$class.aggregate(TraversableOnce.scala:201)
>     at scala.collection.AbstractIterator.aggregate(Iterator.scala:1157)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.rdd.RDD$$anonfun$21.apply(RDD.scala:838)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.SparkContext$$anonfun$23.apply(SparkContext.scala:1116)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>     at org.apache.spark.scheduler.Task.run(Task.scala:51)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
>
>
>
