I searched on this but didn't find anything general, so I apologize if this
has been addressed.

Many algorithms (SGD, SVM, ...) converge very slowly, or not at all, if the
data is not scaled. scikit-learn has preprocessing.scale
<http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html>
which subtracts the mean and divides by the standard deviation, and it takes
a few options as well.
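For example, the linked function standardizes column-wise by default, and
options such as with_mean / with_std toggle each step independently:

    import numpy as np
    from sklearn import preprocessing

    X = np.array([[1.0, -2.0, 10.0],
                  [2.0,  0.0, 20.0],
                  [3.0,  2.0, 30.0]])

    X_scaled = preprocessing.scale(X)   # subtract column mean, divide by column std
    print(X_scaled.mean(axis=0))        # ~[0, 0, 0]
    print(X_scaled.std(axis=0))         # ~[1, 1, 1]

    X_centered = preprocessing.scale(X, with_std=False)  # center only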

Is there something in the works for this?
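For concreteness, the equivalent done by hand over an RDD might look roughly
like this (just a sketch of the idea, not an existing MLlib API; two passes
for the statistics, then a map):

    import numpy as np
    from pyspark import SparkContext

    sc = SparkContext(appName="scale-sketch")
    data = sc.parallelize([np.array([1.0, -2.0, 10.0]),
                           np.array([2.0,  0.0, 20.0]),
                           np.array([3.0,  2.0, 30.0])])

    n = data.count()
    mean = data.reduce(lambda a, b: a + b) / n
    var = data.map(lambda x: (x - mean) ** 2).reduce(lambda a, b: a + b) / n
    std = np.sqrt(var)  # would need a guard for zero-variance columns

    # mean/std ride along in the closure; each point is standardized locally
    scaled = data.map(lambda x: (x - mean) / std)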


