[ 
https://issues.apache.org/jira/browse/FLINK-5588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stavros Kontopoulos updated FLINK-5588:
---------------------------------------
    Description: 
So far ML has two scalers: min-max and the standard scaler.
A third frequently used one is the unit scaler, which scales to unit norm.
We could implement a transformer for this type of scaling, with different norms 
available to the user.
I will make a separate class for the normalization procedure using the 
Transformer API, because it is easy to add;
the fit method does nothing in this case.
Scikit-learn also has some calls available outside the Transformer API; we might 
want to add that in the future.
These calls work on any axis, but they are not reusable in a pipeline [4].
Right now the existing scalers in Flink ML support per-feature normalization by 
using the Transformer API.
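
As a rough illustration of the proposal above, the following is a minimal sketch of a per-sample unit scaler, written in plain Python rather than against the Flink ML Transformer API. The names UnitScaler and vector_norm, the parameter p, and the choice to leave all-zero samples unchanged are assumptions made for this example only, not part of the issue or of any existing library.

```python
import math
from typing import List, Sequence


def vector_norm(values: Sequence[float], p: float) -> float:
    """Return the L^p norm of `values`; p = math.inf gives the max norm."""
    if p == math.inf:
        return max((abs(v) for v in values), default=0.0)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


class UnitScaler:
    """Hypothetical stateless transformer: scales each sample to unit norm."""

    def __init__(self, p: float = 2.0) -> None:
        if p < 1:
            raise ValueError("p must be >= 1")
        self.p = p

    def fit(self, samples: Sequence[Sequence[float]]) -> "UnitScaler":
        # Nothing to learn: each sample is scaled independently of the others.
        return self

    def transform(self, samples: Sequence[Sequence[float]]) -> List[List[float]]:
        scaled = []
        for sample in samples:
            norm = vector_norm(sample, self.p)
            # Assumption: leave all-zero samples unchanged to avoid dividing by zero.
            scaled.append([v / norm for v in sample] if norm > 0 else list(sample))
        return scaled
```

Because fit has nothing to estimate, such a class would mainly exist so that the scaling step can be chained in a pipeline like the other scalers. With p = 2, the sample (3, 4) has norm 5 and would be scaled to approximately (0.6, 0.8).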

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html

  was:
So far ML has two scalers: min-max and the standard scaler.
A third frequently used one is the unit scaler, which scales to unit norm.
We could implement a transformer for this type of scaling, with different norms 
available to the user.

The axis for scaling is either features or samples (0 for columns/features, 1 for 
rows/samples). 
I will make this a separate class for the normalization procedure using the 
Transformer API.
Scikit-learn also has some calls available outside the Transformer API; we might 
want to add that in the future.
Right now the existing scalers in Flink ML support per-feature normalization by 
using the Transformer API.

Resources
[1] https://en.wikipedia.org/wiki/Feature_scaling
[2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
[3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
[4] http://scikit-learn.org/stable/modules/preprocessing.html


> Add a unit scaler based on different norms
> ------------------------------------------
>
>                 Key: FLINK-5588
>                 URL: https://issues.apache.org/jira/browse/FLINK-5588
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Stavros Kontopoulos
>            Assignee: Stavros Kontopoulos
>            Priority: Minor
>
> So far ML has two scalers: min-max and the standard scaler.
> A third frequently used one is the unit scaler, which scales to unit norm.
> We could implement a transformer for this type of scaling, with different norms 
> available to the user.
> I will make a separate class for the normalization procedure using the 
> Transformer API, because it is easy to add;
> the fit method does nothing in this case.
> Scikit-learn also has some calls available outside the Transformer API; we 
> might want to add that in the future.
> These calls work on any axis, but they are not reusable in a pipeline [4].
> Right now the existing scalers in Flink ML support per-feature normalization 
> by using the Transformer API.
> Resources
> [1] https://en.wikipedia.org/wiki/Feature_scaling
> [2] http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html
> [3] https://spark.apache.org/docs/2.1.0/mllib-feature-extraction.html
> [4] http://scikit-learn.org/stable/modules/preprocessing.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
