Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-12-08 Thread Ewan Higgs
Sean, Thanks. It's a developer API and doesn't appear to be exposed. Ewan On 07/12/15 15:06, Sean Owen wrote: I'm not sure if this is available in Python but from 1.3 on you should be able to call ALS.setFinalRDDStorageLevel with level "none" to ask it to unpersist when it is done. On Mon, De

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-12-07 Thread Sean Owen
I'm not sure if this is available in Python but from 1.3 on you should be able to call ALS.setFinalRDDStorageLevel with level "none" to ask it to unpersist when it is done. On Mon, Dec 7, 2015 at 1:42 PM, Ewan Higgs wrote: > Jonathan, > Did you ever get to the bottom of this? I have some users wo

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-12-07 Thread Ewan Higgs
Jonathan, Did you ever get to the bottom of this? I have some users working with Spark in a classroom setting and our example notebooks run into problems where there is so much spilled to disk that they run out of quota. A 1.5G input set becomes >30G of spilled data on disk. I looked into how I

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-27 Thread Xiangrui Meng
> Thank you, > Ilya Ganelin > > -----Original Message----- > From: Stahlman, Jonathan [jonathan.stahl...@capitalone.com] > Sent: Wednesday, July 22, 2015 01:42 PM Eastern Standard Time > To: user@spark.apache.org > Subject: Re: How to unpersist RDDs gene

RE: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-22 Thread Ganelin, Ilya
talone.com] Sent: Wednesday, July 22, 2015 01:42 PM Eastern Standard Time To: user@spark.apache.org Subject: Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel Hello again, In trying to understand the caching of intermediate RDDs by ALS, I looked into

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-22 Thread Stahlman, Jonathan
Subject: Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel Hi Jonathan, I believe calling persist with StorageLevel.NONE doesn't do anything. That's why the unpersist has an if statement befo

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-22 Thread Burak Yavuz
> This doesn’t make sense to me – I would expect the RDDs to be removed from the cache if finalRDDStorageLevel == StorageLevel.NONE, not the other way around. > Jonathan > From: , Stahlman Jonathan > Date: Thursday, July 16, 2015 at 2:18 PM

Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-22 Thread Stahlman, Jonathan
. Jonathan From: , Stahlman Jonathan <jonathan.stahl...@capitalone.com> Date: Thursday, July 16, 2015 at 2:18 PM To: "user@spark.apache.org" <user@spark.apache.org> Subject: How to unpersist RDDs generated by ALS/MatrixFactoriz

How to unpersist RDDs generated by ALS/MatrixFactorizationModel

2015-07-16 Thread Stahlman, Jonathan
Hello all, I am running the Spark recommendation algorithm in MLlib and I have been studying its output with various model configurations. Ideally I would like to be able to run one job that trains the recommendation model with many different configurations to try to optimize for performance.