I'd also say that running for 100 iterations is a waste of resources: ALS typically converges quickly, within 10-20 iterations.
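For what it's worth, here is a minimal sketch of the checkpointing workaround Cheng suggests below, applied to an iterative join loop like the one in the original message. The RDD names, the checkpoint directory, and the interval of 10 are illustrative assumptions, not taken from the actual code:

import org.apache.spark.{SparkConf, SparkContext}

object CheckpointedLoop {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("checkpoint-demo").setMaster("local[*]"))
    // Checkpoint files need reliable storage; use an HDFS path on a real cluster.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    val outlinks = sc.parallelize(Seq((1, 2.0), (2, 3.0))).cache()
    var solution = sc.parallelize(Seq((1, 1.0), (2, 1.0)))

    for (i <- 1 to 100) {
      solution = outlinks.join(solution).mapValues { case (w, s) => w * s }
      if (i % 10 == 0) {
        // Truncate the lineage periodically; otherwise the DAG grows by one
        // join per iteration and eventually blows the stack.
        solution.persist()    // avoid recomputing when the checkpoint job runs
        solution.checkpoint()
        solution.count()      // checkpoint() is lazy; an action forces it
      }
    }
    println(solution.collect().mkString(", "))
    sc.stop()
  }
}

Note the persist() before the checkpoint: checkpoint() recomputes the RDD in a separate job when it writes to disk, so without caching you pay for each iteration twice.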
On Wed, Apr 16, 2014 at 3:54 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:

> Thanks a lot for your information. It really helps me.
>
>
> On Tue, Apr 15, 2014 at 7:57 PM, Cheng Lian <lian.cs....@gmail.com> wrote:
>
>> Probably this JIRA issue
>> <https://spark-project.atlassian.net/browse/SPARK-1006> solves your
>> problem. When running with a large number of iterations, the lineage
>> DAG of ALS becomes very deep; both the DAGScheduler and the Java
>> serializer may overflow because they are implemented recursively. You
>> may resort to checkpointing as a workaround.
>>
>>
>> On Wed, Apr 16, 2014 at 5:29 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am testing ALS on 7 nodes. Each node has 4 cores and 8 GB of memory.
>>> The ALS program cannot run even on a very small training set (about
>>> 91 lines) due to a StackOverflowError when I set the number of
>>> iterations to 100. I think the problem may be caused by the
>>> updateFeatures method, which updates the products RDD iteratively by
>>> joining it with the previous products RDD.
>>>
>>>
>>> I am writing a program that has a similar update process to ALS. The
>>> same problem appeared when I iterated too many times (more than 80).
>>>
>>> The iterative part of my code is as follows:
>>>
>>> solution = outlinks.join(solution).map {
>>>   .......
>>> }
>>>
>>>
>>> Has anyone had a similar problem? Thanks.
>>>
>>>
>>> Xiaoli
>>
>>
>