In the "new" ALS (the spark.ml implementation), intermediate RDDs (including the ratings input RDD after it has been transformed into block-partitioned ratings) are cached using intermediateRDDStorageLevel, and you can select the storage level of the final RDDs (the user and item factors) using finalRDDStorageLevel.
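As a rough sketch of how those two storage levels can be set through the mllib API: the toy ratings, the app name, the local master and the rank/iterations/lambda values below are made up for illustration, and MEMORY_AND_DISK is just one possible choice of level.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}
import org.apache.spark.storage.StorageLevel

object AlsStorageLevels {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master and a throwaway app name, for illustration only.
    val conf = new SparkConf().setAppName("als-storage-levels").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Hypothetical toy ratings (user, product, rating).
    // The raw input RDD is deliberately left uncached here.
    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 4.0),
      Rating(1, 20, 1.0),
      Rating(2, 10, 5.0),
      Rating(2, 30, 2.0)
    ))

    val model = new ALS()
      .setRank(5)        // illustrative values, not recommendations
      .setIterations(5)
      .setLambda(0.1)
      // Storage level for the intermediate (block-partitioned) RDDs.
      .setIntermediateRDDStorageLevel(StorageLevel.MEMORY_AND_DISK)
      // Storage level for the final user and item factor RDDs.
      .setFinalRDDStorageLevel(StorageLevel.MEMORY_AND_DISK)
      .run(ratings)

    // Inspect the storage level of the resulting factor RDDs.
    println(model.userFeatures.getStorageLevel)
    println(model.productFeatures.getStorageLevel)

    sc.stop()
  }
}
```

Both setters are marked as developer API in mllib, so their behaviour may differ between Spark versions.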
The old MLlib API now calls the new ALS, so the same semantics apply. It should therefore not be necessary to cache the raw input RDD.

On Tue, 9 Feb 2016 at 01:48 Roberto Pagliari <roberto.pagli...@asos.com> wrote:

> When using ALS from mllib, would it be better/recommended to cache the
> ratings RDD?
>
> I’m asking because when predicting products for users (for example) it is
> recommended to cache product/user matrices.
>
> Thank you,