[ https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stavros Kontopoulos updated FLINK-5588:
---------------------------------------
    Description:
So far ML has two scalers: min-max and the standard scaler.
A third one, frequently used, is the scaler to unit norm.
We could implement a transformer for this type of scaling, with different norms available to the user.
I will make a separate class for the normalization procedure using the Transformer API because it is easy to add; the fit method does nothing in this case.
Scikit-learn also has some calls available outside the Transformer API; we might want to add those in the future.
These calls work on any axis, but they are not reusable in a pipeline [4].
Right now the existing scalers in Flink ML support per-feature normalization by using the Transformer API.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

  was:
So far ML has two scalers: min-max and the standard scaler.
A third one, frequently used, is the scaler to unit norm.
We could implement a transformer for this type of scaling, with different norms available to the user.
The axis for scaling is either features or samples (0 for feature columns, 1 for sample rows).
I will make this a separate class for the normalization procedure using the Transformer API.
Scikit-learn also has some calls available outside the Transformer API; we might want to add those in the future.
Right now the existing scalers in Flink ML support per-feature normalization by using the Transformer API.
Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

> Add a unit scaler based on different norms
> ------------------------------------------
>
>                 Key: FLINK-5588
>                 URL: https://issues.apache.org/jira/browse/FLINK-5588
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Stavros Kontopoulos
>            Assignee: Stavros Kontopoulos
>            Priority: Minor
>
> So far ML has two scalers: min-max and the standard scaler.
> A third one, frequently used, is the scaler to unit norm.
> We could implement a transformer for this type of scaling, with different
> norms available to the user.
> I will make a separate class for the normalization procedure using the
> Transformer API because it is easy to add; the fit method does nothing in
> this case.
> Scikit-learn also has some calls available outside the Transformer API; we
> might want to add those in the future.
> These calls work on any axis, but they are not reusable in a pipeline [4].
> Right now the existing scalers in Flink ML support per-feature normalization
> by using the Transformer API.
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
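
For illustration only, the per-sample scaling to unit norm described in the issue could look roughly like the sketch below. It is plain Python rather than Flink ML code, and the names `vector_norm` and `unit_scale`, the `norm` parameter values, and the handling of all-zero rows are assumptions made for this example, not part of the proposal. It also shows why a fit step would have nothing to do: each row is scaled using only its own norm.

```python
import math
from typing import List, Sequence


def vector_norm(vector: Sequence[float], norm: str = "l2") -> float:
    """Return the L1, L2, or max norm of a vector (hypothetical helper)."""
    if norm == "l1":
        return sum(abs(x) for x in vector)
    if norm == "l2":
        return math.sqrt(sum(x * x for x in vector))
    if norm == "max":
        return max((abs(x) for x in vector), default=0.0)
    raise ValueError(f"unsupported norm: {norm}")


def unit_scale(samples: Sequence[Sequence[float]], norm: str = "l2") -> List[List[float]]:
    """Scale every sample (row) to unit norm.

    No statistics are learned from the data set, so there is no fit step.
    """
    scaled = []
    for row in samples:
        n = vector_norm(row, norm)
        # Assumption: all-zero rows are returned unchanged to avoid dividing by zero.
        scaled.append([x / n for x in row] if n > 0 else list(row))
    return scaled


data = [[3.0, 4.0], [1.0, -1.0], [0.0, 0.0]]
l2_scaled = unit_scale(data, "l2")
l1_scaled = unit_scale(data, "l1")
max_scaled = unit_scale(data, "max")
```

By contrast, the existing min-max and standard scalers need a fit step because they compute per-feature statistics over the whole data set before transforming it.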