Features scaling

Denys Kozyr Tue, 21 Apr 2015 01:27:07 -0700

Hi!

I want to normalize features before training a logistic regression model. I set up a scaler:


scaler2 = StandardScaler(withMean=True, withStd=True).fit(features)

and apply it to a dataset:

scaledData = dataset.map(lambda x: LabeledPoint(x.label,
scaler2.transform(Vectors.dense(x.features.toArray() ))))

but I can't work with scaledData (I can't output it or train a regression
on it). I get this error:

Exception: It appears that you are attempting to reference SparkContext from a
broadcast variable, action, or transforamtion. SparkContext can only be used on
the driver, not in code that it run on workers. For more information, see
SPARK-5063.

Is this the correct way to normalize the features? Why doesn't it work?
Any advice is welcome.
Thanks.

Full code:
https://gist.github.com/dkozyr/d31551a3ebed0ee17772

Console output:
https://gist.github.com/dkozyr/199f0d4f44cf522f9453
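
To be clear about the result I expect from the scaling step, here is a minimal plain-Python sketch (no Spark involved). The function name standardize and the sample rows are made up for illustration. I am assuming that withMean=True subtracts the per-column mean and that withStd=True divides by the per-column sample (n - 1) standard deviation; I have not checked that against the implementation.

```python
import math


def standardize(rows):
    """Center each column to zero mean and scale it to unit sample std.

    `rows` is a list of equal-length lists of floats (at least two rows).
    """
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    # Sample (n - 1) standard deviation per column. This choice is an
    # assumption about what the scaler does, not taken from its source.
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(dims)
    ]
    # Constant columns (std == 0) are mapped to 0.0 to avoid division by zero.
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dims)]
        for r in rows
    ]


# Hypothetical sample data: two features on very different scales.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize(rows)
```

After this, both columns of scaled have zero mean and unit sample standard deviation, which is what I would like scaledData to contain for each LabeledPoint.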

Denys
