I'd also say that running for 100 iterations is a waste of resources: ALS typically converges quickly, within 10-20 iterations.
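For what it's worth, here is a minimal sketch of the checkpointing workaround Cheng suggests below, applied to an iterative join loop like the one in the original message. The RDD names, the checkpoint directory, and the interval of 10 are illustrative assumptions, not taken from the actual code:

import org.apache.spark.{SparkConf, SparkContext}

object CheckpointedLoop {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("checkpoint-demo").setMaster("local[*]"))
    // Checkpoint files need reliable storage; use an HDFS path on a real cluster.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val outlinks = sc.parallelize(Seq((1, 2.0), (2, 3.0))).cache()
    var solution = sc.parallelize(Seq((1, 1.0), (2, 1.0)))

    for (i <- 1 to 100) {
      solution = outlinks.join(solution).mapValues { case (w, s) => w * s }
      if (i % 10 == 0) {
        // Truncate the lineage periodically; otherwise the DAG grows by one
        // join per iteration and eventually blows the stack.
        solution.persist()    // avoid recomputing when the checkpoint job runs
        solution.checkpoint()
        solution.count()      // checkpoint() is lazy; an action forces it
      }
    }
    println(solution.collect().mkString(", "))
    sc.stop()
  }
}

Note the persist() before the checkpoint: checkpoint() recomputes the RDD in a separate job when it writes to disk, so without caching you pay for each iteration twice.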
On Wed, Apr 16, 2014 at 3:54 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:

> Thanks a lot for your information. It really helps me.
>
>
> On Tue, Apr 15, 2014 at 7:57 PM, Cheng Lian <lian.cs....@gmail.com> wrote:
>
>> Probably this JIRA issue
>> <https://spark-project.atlassian.net/browse/SPARK-1006> solves your
>> problem. When running with a large number of iterations, the lineage
>> DAG of ALS becomes very deep; both the DAGScheduler and the Java
>> serializer may overflow because they are implemented recursively. You
>> may resort to checkpointing as a workaround.
>>
>>
>> On Wed, Apr 16, 2014 at 5:29 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am testing ALS on 7 nodes. Each node has 4 cores and 8 GB of memory.
>>> The ALS program cannot run even on a very small training set (about
>>> 91 lines) due to a StackOverflowError when I set the number of
>>> iterations to 100. I think the problem may be caused by the
>>> updateFeatures method, which updates the products RDD iteratively by
>>> joining it with the previous products RDD.
>>>
>>>
>>> I am writing a program that has a similar update process to ALS. The
>>> same problem appeared when I iterated too many times (more than 80).
>>>
>>> The iterative part of my code is as follows:
>>>
>>> solution = outlinks.join(solution).map {
>>>   .......
>>> }
>>>
>>>
>>> Has anyone had a similar problem? Thanks.
>>>
>>>
>>> Xiaoli
>>
>>
>