Hey Sandy,
The work should be done by a VectorAssembler, which combines multiple
columns (double/int/vector) into a vector column, which becomes the
features column for regression. We are going to create JIRAs for each
of these standard feature transformers. It would be great if you can
help imple
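
For reference, here is a minimal sketch of how such an assembler could be used. It assumes the VectorAssembler API as it later shipped in Spark ML (org.apache.spark.ml.feature.VectorAssembler with setInputCols/setOutputCol) and a SparkSession entry point, neither of which is confirmed by this thread. The column names "label", "f1", "f2" and the sample rows are hypothetical.

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object AssembleSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local SparkSession; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("assemble-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical all-Double input: the first column is the label,
    // the remaining columns are features.
    val df = Seq((1.0, 0.5, 2.5), (0.0, 1.5, 3.5)).toDF("label", "f1", "f2")

    // Combine every column except the first into one vector column.
    val assembler = new VectorAssembler()
      .setInputCols(df.columns.tail)
      .setOutputCol("features")

    // Result has the original label plus the assembled features vector.
    assembler.transform(df).select("label", "features").show()

    spark.stop()
  }
}
```

Taking df.columns.tail keeps the label out of the assembled vector, which is the same concern raised about the map-based approach elsewhere in this thread.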
I think there is a minor error here in that the first example needs a
"tail" after the seq:
df.map { row =>
  (row.getDouble(0), row.toSeq.tail.map(_.asInstanceOf[Double]))
}.toDataFrame("label", "features")
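
To illustrate why the "tail" matters, here is a small sketch in plain Scala with no Spark dependency. A Seq[Any] stands in for a Row's toSeq, and the object and method names (TailSketch, split, splitWithoutTail) are hypothetical, chosen only for this illustration.

```scala
object TailSketch {
  // Split a row-like sequence into (label, features), assuming the first
  // element is the label and every element is a Double.
  def split(row: Seq[Any]): (Double, Seq[Double]) =
    (row.head.asInstanceOf[Double], row.tail.map(_.asInstanceOf[Double]))

  // The variant without tail: the label also ends up among the features.
  def splitWithoutTail(row: Seq[Any]): (Double, Seq[Double]) =
    (row.head.asInstanceOf[Double], row.map(_.asInstanceOf[Double]))

  def main(args: Array[String]): Unit = {
    val row: Seq[Any] = Seq(1.0, 0.5, 2.5)
    println(split(row))            // label and two features
    println(splitWithoutTail(row)) // label and three "features", label included
  }
}
```

Without the tail, the label leaks into the feature vector, so a regression would trivially see its own target as an input.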
On Wed, Feb 11, 2015 at 7:46 PM, Michael Armbrust
wrote:
It sounds like you probably want to do a standard Spark map, that results
in a tuple with the structure you are looking for. You can then just
assign names to turn it back into a dataframe.
Assuming the first column is your label and the rest are features you can
do something like this:
val df =
Hey All,
I've been playing around with the new DataFrame and ML pipelines APIs and
am having trouble accomplishing what seems like it should be a fairly basic
task.
I have a DataFrame where each column is a Double. I'd like to turn this
into a DataFrame with a features column and a label column tha