[ https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stavros Kontopoulos updated FLINK-5588:
---------------------------------------
    Description: 
So far Flink ML has two scalers: the min-max scaler and the standard scaler.
A third frequently used one is the unit scaler, which rescales vectors to unit norm.
We could implement a transformer for this type of scaling, with different norms
available to the user.

An axis parameter would select whether features or samples are scaled (0 for
columns/features, 1 for rows/samples).
I will make this a separate class for the normalization procedure, using the
Transformer API.
Scikit-learn also has some calls available outside the Transformer API; we might
want to add that in the future.
Right now the existing scalers in Flink ML support per-feature normalization by
using the Transformer API.
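
To make the proposal concrete, here is a minimal sketch in plain Python of
scaling to unit norm with a selectable norm and axis. It is not Flink ML code
and does not use the Transformer API; the names unit_scale and vector_norm,
the norm labels "l1", "l2" and "max", and the choice to leave zero vectors
unchanged are all assumptions made for illustration.

```python
import math


def vector_norm(values, norm="l2"):
    """Return the chosen norm of a sequence of numbers (hypothetical helper)."""
    if norm == "l1":
        return sum(abs(v) for v in values)
    if norm == "l2":
        return math.sqrt(sum(v * v for v in values))
    if norm == "max":
        return max((abs(v) for v in values), default=0.0)
    raise ValueError("unknown norm: %r" % (norm,))


def unit_scale(rows, norm="l2", axis=1):
    """Scale a list of rows so each vector along `axis` has unit norm.

    axis=0 scales each column (feature); axis=1 scales each row (sample),
    following the convention in the issue description.
    """
    if axis == 1:
        vectors = [list(r) for r in rows]
    elif axis == 0:
        # Transpose so that each column becomes a vector to scale.
        vectors = [list(c) for c in zip(*rows)]
    else:
        raise ValueError("axis must be 0 or 1")

    scaled = []
    for vec in vectors:
        n = vector_norm(vec, norm)
        # Assumption: a zero vector has no direction, so leave it unchanged.
        scaled.append([v / n for v in vec] if n else list(vec))

    if axis == 0:
        # Transpose back to the original row-major layout.
        scaled = [list(r) for r in zip(*scaled)]
    return scaled
```

For example, unit_scale([[3.0, 4.0]], norm="l2", axis=1) returns [[0.6, 0.8]],
since the L2 norm of the sample (3, 4) is 5.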

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

  was:
So far ML has two scalers: min-max and the standard scaler.
A third one frequently used, is the scaler to unit.
We could implement a transformer for this type of scaling for different norms 
available to the user.
Axis for scaling either features or samples (0 for columns-features 1 for 
samples-rows). 
Right now the existing scalers support per-feature normalization. I think it's
trivial to add per-sample normalization.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html


> Add a unit scaler based on different norms
> ------------------------------------------
>
>                 Key: FLINK-5588
>                 URL: https://issues.apache.org/jira/browse/FLINK-5588
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Stavros Kontopoulos
>            Assignee: Stavros Kontopoulos
>            Priority: Minor



