Hi!

I want to normalize features before training logistic regression. I set up a scaler:

    scaler2 = StandardScaler(withMean=True, withStd=True).fit(features)

and apply it to a dataset:

    scaledData = dataset.map(lambda x: LabeledPoint(x.label, scaler2.transform(Vectors.dense(x.features.toArray()))))

but I can't work with scaledData (I can't output it or train a regression on it). I get this error:

    Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.

Is this the correct way to do the normalization? Why doesn't it work? Any advice is welcome.

Thanks.

Full code: https://gist.github.com/dkozyr/d31551a3ebed0ee17772
Console output: https://gist.github.com/dkozyr/199f0d4f44cf522f9453

Denys
