[ https://issues.apache.org/jira/browse/FLINK-2162 ]
Till Rohrmann updated FLINK-2162:
---------------------------------
    Assignee: Ventura Del Monte

> Implement adaptive learning rate strategies for SGD
> ---------------------------------------------------
>
>                 Key: FLINK-2162
>                 URL: https://issues.apache.org/jira/browse/FLINK-2162
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Ventura Del Monte
>            Priority: Minor
>              Labels: ML
>
> At the moment, the SGD implementation uses a simple adaptive learning rate
> strategy, {{adaptedLearningRate = initialLearningRate/sqrt(iterationNumber)}},
> which makes the optimization algorithm sensitive to the setting of the
> {{initialLearningRate}}. If this value is chosen badly, SGD can become
> unstable.
> There are better ways to calculate the learning rate [1], such as Adagrad [3],
> Adadelta [4], SGD with momentum [5], and others [2]. They promise more stable
> optimization algorithms that require less hyperparameter tweaking. It might be
> worthwhile to investigate these approaches.
> It might also be interesting to look at the implementation of Vowpal Wabbit [6].
> Resources:
> [1] [http://imgur.com/a/Hqolp]
> [2] [http://cs.stanford.edu/people/karpathy/convnetjs/demo/trainers.html]
> [3] [http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf]
> [4] [http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf]
> [5] [http://www.willamette.edu/~gorr/classes/cs449/momrate.html]
> [6] [https://github.com/JohnLangford/vowpal_wabbit]
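For illustration, here is a minimal self-contained Scala sketch of the two strategies the issue contrasts: the current inverse-square-root decay and a per-coordinate Adagrad update in the style of [3]. This is plain Scala, not Flink's actual optimization API; the names {{inverseSqrtDecay}}, {{Adagrad}}, and {{update}} are hypothetical and chosen only for this example.

{code:scala}
import scala.math.sqrt

object LearningRateSketch {

  // Current behavior described in the issue: eta_t = eta_0 / sqrt(t).
  // A bad choice of initialLearningRate makes every step too large or
  // too small, which is the sensitivity the issue points out.
  def inverseSqrtDecay(initialLearningRate: Double, iteration: Int): Double =
    initialLearningRate / sqrt(iteration.toDouble)

  /** Adagrad [3]: each coordinate accumulates its squared gradients and
    * is scaled by 1 / sqrt(accumulated), so frequently updated coordinates
    * take smaller steps. `epsilon` guards against division by zero. */
  class Adagrad(learningRate: Double, dim: Int, epsilon: Double = 1e-8) {
    private val accumulated = Array.fill(dim)(0.0)

    def update(weights: Array[Double], gradient: Array[Double]): Unit = {
      var i = 0
      while (i < dim) {
        accumulated(i) += gradient(i) * gradient(i)
        weights(i) -= learningRate / (sqrt(accumulated(i)) + epsilon) * gradient(i)
        i += 1
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Toy quadratic objective f(w) = 0.5 * ||w||^2, whose gradient is w itself.
    val weights = Array(1.0, -2.0)
    val adagrad = new Adagrad(learningRate = 0.5, dim = 2)
    for (_ <- 1 to 100) {
      val gradient = weights.clone()
      adagrad.update(weights, gradient)
    }
    println(weights.mkString(", ")) // converges toward (0.0, 0.0)
  }
}
{code}

Note that the Adagrad variant keeps per-coordinate state across iterations, which is the main implementation difference from the current stateless decay formula; Adadelta [4] and momentum [5] similarly carry running state between updates.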