Re: VectorAssembler handling null values

2016-04-20 Thread Andres Perez
at 1:38 AM, Nick Pentreath wrote:
> Could you provide an example of what your input data looks like?
> Supporting missing values in a sparse result vector makes sense.
>
> On Tue, 19 Apr 2016 at 23:55, Andres Perez wrote:
>
>> Hi everyone. org.apache.spark.ml.feature.Vect

VectorAssembler handling null values

2016-04-19 Thread Andres Perez
Hi everyone. org.apache.spark.ml.feature.VectorAssembler currently cannot handle null values. This presents a problem for us as we wish to run a decision tree classifier on sometimes sparse data. Is there a particular reason VectorAssembler is implemented in this way, and can anyone recommend the b

Re: stopped SparkContext remaining active

2015-07-29 Thread Andres Perez
5 at 1:10 PM, Ted Yu wrote:
> bq. it seems like we never get to the clearActiveContext() call by the end
>
> Looking at stop() method, there is only one early return
> after stopped.compareAndSet() call.
> Is there any clue from driver log ?
>
> Cheers
>
> On Wed, Jul 29

stopped SparkContext remaining active

2015-07-29 Thread Andres Perez
Hi everyone. I'm running into an issue with SparkContexts when running on Yarn. The issue is observable when I reproduce these steps in the spark-shell (version 1.4.1):

scala> sc
res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@7b965dee

*Note the pointer address of sc.

(Then y