Re: StackOverflow Error when run ALS with 100 iterations

Xiaoli Li Tue, 15 Apr 2014 18:55:31 -0700

Thanks a lot for your information. It really helps me.


On Tue, Apr 15, 2014 at 7:57 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

> This JIRA issue <https://spark-project.atlassian.net/browse/SPARK-1006>
> probably addresses your problem. With a large number of iterations, the
> lineage DAG of ALS becomes very deep, and both the DAGScheduler and the Java
> serializer may overflow the stack because they are implemented recursively.
> You can use checkpointing as a workaround.
>
>
> On Wed, Apr 16, 2014 at 5:29 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:
>
>> Hi,
>>
>> I am testing ALS on 7 nodes. Each node has 4 cores and 8 GB of memory. The
>> ALS program fails even with a very small training set (about 91 lines),
>> throwing a StackOverflowError when I set the number of iterations to 100.
>> I think the problem may be caused by the updateFeatures method, which
>> updates the products RDD iteratively by joining with the previous products
>> RDD.
>>
>>
>> I am writing a program whose update process is similar to that of ALS. The
>> same problem appears when I iterate too many times (more than 80).
>>
>> The iterative part of my code is as follows:
>>
>> solution = outlinks.join(solution).map {
>>      .......
>> }
>>
>>
>> Has anyone had a similar problem?  Thanks.
>>
>>
>> Xiaoli
>>
>
>
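
The checkpointing workaround suggested in the reply can be sketched roughly as
follows. This is a minimal illustration, not the code from the thread: the toy
data, the body of the map, the checkpoint directory, the local master setting
and the checkpointInterval value are all assumptions made for the example.
Only SparkContext.setCheckpointDir, RDD.cache, RDD.checkpoint and an action
such as count are the actual Spark calls involved.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits (join) on older Spark releases
import org.apache.spark.rdd.RDD

object CheckpointedIteration {
  def main(args: Array[String]): Unit = {
    // Assumption: local master for a quick trial; use your cluster settings instead.
    val conf = new SparkConf().setAppName("CheckpointedIteration").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // Hypothetical directory; on a cluster it should live on shared storage such as HDFS.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    // Toy stand-ins for the poster's data: (node, neighbour) and (node, value) pairs.
    val outlinks: RDD[(Int, Int)] = sc.parallelize(Seq((1, 2), (2, 3), (3, 1))).cache()
    var solution: RDD[(Int, Double)] = sc.parallelize(Seq((1, 1.0), (2, 1.0), (3, 1.0)))

    val numIterations = 100
    val checkpointInterval = 10 // assumption: tune this for your job

    for (i <- 1 to numIterations) {
      // Placeholder update step: forward each value to the neighbouring node.
      solution = outlinks.join(solution).map { case (_, (dst, value)) => (dst, value) }

      if (i % checkpointInterval == 0) {
        solution.cache()      // avoid recomputing the RDD when the checkpoint is written
        solution.checkpoint() // mark for checkpointing; call before any action on this RDD
        solution.count()      // action that materializes the RDD and writes the checkpoint
      }
    }

    println(solution.collect().sortBy(_._1).mkString(", "))
    sc.stop()
  }
}
```

Once the checkpoint has been written, the RDD no longer refers to its parent
RDDs, so the lineage that the scheduler and the serializer have to walk stays
at most checkpointInterval join/map steps deep instead of growing with every
iteration.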
