[ https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stavros Kontopoulos updated FLINK-5588:
---------------------------------------
    Description:
So far ML has two scalers: min-max and the standard scaler.
A third, frequently used one is the unit scaler.
We could implement a transformer for this type of scaling, with different norms available to the user.
I will make a separate class for the per-sample normalization procedure using the Transformer API, because it is easy to add; the fit method does nothing in this case.
Scikit-learn also has some calls available outside the Transformer API; we might want to add that in the future.
These calls work on any axis, but they are not reusable in a pipeline [4].
Right now the existing scalers in Flink ML support per-feature normalization by using the Transformer API.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

  was:
So far ML has two scalers: min-max and the standard scaler.
A third, frequently used one is the unit scaler.
We could implement a transformer for this type of scaling, with different norms available to the user.
I will make a separate class for the normalization procedure using the Transformer API, because it is easy to add; the fit method does nothing in this case.
Scikit-learn also has some calls available outside the Transformer API; we might want to add that in the future.
These calls work on any axis, but they are not reusable in a pipeline [4].
Right now the existing scalers in Flink ML support per-feature normalization by using the Transformer API.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

> Add a unit scaler based on different norms
> ------------------------------------------
>
>                 Key: FLINK-5588
>                 URL: https://issues.apache.org/jira/browse/FLINK-5588
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Stavros Kontopoulos
>            Assignee: Stavros Kontopoulos
>            Priority: Minor
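
For illustration only, the per-sample scaling described in this issue could be sketched roughly as follows. This is plain Python rather than Flink ML code, and the names UnitScaler and vector_norm, the supported norm labels ("l1", "l2", "max"), and the choice to leave all-zero samples unchanged are assumptions made for the sketch, not part of the proposal or of any existing API.

```python
import math
from typing import Iterable, List, Sequence


def vector_norm(vector: Sequence[float], norm: str = "l2") -> float:
    """Return the L1, L2, or max norm of a single sample (hypothetical helper)."""
    if norm == "l1":
        return sum(abs(x) for x in vector)
    if norm == "l2":
        return math.sqrt(sum(x * x for x in vector))
    if norm == "max":
        return max((abs(x) for x in vector), default=0.0)
    raise ValueError(f"unsupported norm: {norm!r}")


class UnitScaler:
    """Hypothetical per-sample scaler with a fit/transform shape."""

    def __init__(self, norm: str = "l2") -> None:
        self.norm = norm

    def fit(self, samples: Iterable[Sequence[float]]) -> "UnitScaler":
        # Nothing is learned: each sample is scaled by its own norm,
        # so fit is a no-op, as the issue description notes.
        return self

    def transform(self, samples: Iterable[Sequence[float]]) -> List[List[float]]:
        scaled = []
        for sample in samples:
            n = vector_norm(sample, self.norm)
            # Assumption: an all-zero sample is returned unchanged
            # to avoid dividing by zero.
            scaled.append([x / n for x in sample] if n > 0 else list(sample))
        return scaled
```

Unlike the min-max and standard scalers, which compute statistics per feature across the whole data set, this works row by row, which is why no fitting step is needed.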