[How-To][SQL] Create a dataframe inside the TableScan.buildScan method of a relation

2017-06-22 Thread OBones
Hello, I'm trying to extend Spark so that it can use our own binary format as a read-only source for pipeline-based computations. I already have a Java class that gives me enough elements to build a complete StructType with enough metadata (NominalAttribute for instance). It also gives me the r

Re: Handling nulls in vector columns is non-trivial

2017-06-22 Thread Franklyn D'souza
We've developed Scala UDFs internally to address some of these issues and we'd love to upstream them back to Spark. Just trying to figure out what the vector support looks like on the roadmap. Would it be best to put this functionality into the Imputer, VectorAssembler or maybe try to give it mor

Fwd: A question about rdd transformation

2017-06-22 Thread Lionel Luffy
Adding the dev list. Who can help with the question below? Thanks & Best Regards, LL -- Forwarded message -- From: Lionel Luffy Date: Fri, Jun 23, 2017 at 11:20 AM Subject: Re: A question about rdd transformation To: u...@spark.apache.org I have now found that the root cause is a Wrapper class in Any