[ https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stavros Kontopoulos updated FLINK-5588:
---------------------------------------
    Description:
So far ML has two scalers: min-max and the standard scaler.
A third, frequently used one is the unit scaler, which rescales vectors to unit norm.
We could implement a transformer for this type of scaling, with different norms available to the user.
An axis parameter would select whether to scale features or samples (0 for features, i.e. columns; 1 for samples, i.e. rows).
Right now the existing scalers support per-feature normalization. I think it's trivial to add per-sample normalization.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html

  was:
So far ML has two scalers: min-max and the standard.
A third, frequently used one is the unit scaler, which rescales vectors to unit norm.
We could implement a transformer for this type of scaling, with different norms available to the user.
An axis parameter would select whether to scale features or samples (0 for features, i.e. columns; 1 for samples, i.e. rows).
Right now the existing scalers support per-feature normalization. I think it's trivial to add per-sample normalization.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html

> Add a unit scaler based on different norms
> ------------------------------------------
>
>                 Key: FLINK-5588
>                 URL: https://issues.apache.org/jira/browse/FLINK-5588
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Stavros Kontopoulos
>            Assignee: Stavros Kontopoulos
>            Priority: Minor
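
To make the proposed behaviour concrete, here is a minimal sketch of unit scaling with a selectable norm and axis. It is not the Flink ML API: the function names `vector_norm` and `scale_to_unit`, the norm labels "l1", "l2" and "max", and the choice to leave all-zero vectors unchanged are assumptions made for illustration. The axis convention (0 for features/columns, 1 for samples/rows) follows the description in this issue.

```python
import math


def vector_norm(values, norm="l2"):
    """Return the L1, L2, or max norm of a sequence of numbers (hypothetical helper)."""
    if norm == "l1":
        return sum(abs(v) for v in values)
    if norm == "l2":
        return math.sqrt(sum(v * v for v in values))
    if norm == "max":
        return max((abs(v) for v in values), default=0.0)
    raise ValueError(f"unknown norm: {norm!r}")


def scale_to_unit(matrix, norm="l2", axis=1):
    """Scale each row (axis=1) or each column (axis=0) of a list-of-lists to unit norm.

    Assumption: vectors whose norm is zero are returned unchanged.
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 (features) or 1 (samples)")
    if axis == 0:
        # Transpose so columns become rows, scale them, then transpose back.
        columns = [list(col) for col in zip(*matrix)]
        scaled = scale_to_unit(columns, norm=norm, axis=1)
        return [list(row) for row in zip(*scaled)]
    result = []
    for row in matrix:
        n = vector_norm(row, norm)
        result.append([v / n for v in row] if n > 0 else list(row))
    return result


data = [[3.0, 4.0], [0.0, 2.0]]
per_sample = scale_to_unit(data, norm="l2", axis=1)   # each row has L2 norm 1
per_feature = scale_to_unit(data, norm="l1", axis=0)  # each column has L1 norm 1
```

Per-feature scaling is expressed here as per-sample scaling on the transposed data, which is one way to see why adding the second axis is a small step once the first exists. In a distributed setting the per-feature case would additionally need an aggregation over the data set to compute the column norms.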