Rerun person.count and you will see the performance benefit of the cache.

person.cache does not cache the RDD right away. The RDD is actually cached only
after the first action (person.count here) materializes it.
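
As a minimal spark-shell sketch of this behavior (reusing the placeholder HDFS
path from the quoted code below; person.getStorageLevel only prints the level
the RDD is marked with, while the Storage tab in the web UI fills in only after
the first action runs):

case class Person(id: Int, col1: String)

val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt")
  .map(_.split(","))
  .map(p => Person(p(0).trim.toInt, p(1)))

person.cache()                    // lazy: marks the RDD for caching, stores nothing yet
println(person.getStorageLevel)   // shows the storage level the RDD will use
person.count()                    // first action: scans HDFS and populates the cache
person.count()                    // second action: reads the cached partitions, noticeably faster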

----- Original Message -----

From: fightf...@163.com
To: "user" <user@spark.apache.org>
Sent: Wednesday, April 1, 2015, 1:21:25 PM
Subject: rdd.cache() not working ?

Hi, all 

Running the following code snippet through spark-shell, I cannot see any
cached storage partitions in the web UI.

Does this mean the cache is not working? If we issue person.count again, we
do not see any improvement in run time.

Hope anyone can explain this a little.

Best, 

Sun. 

case class Person(id: Int, col1: String)

val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt")
  .map(_.split(","))
  .map(p => Person(p(0).trim.toInt, p(1)))

person.cache
person.count


fightf...@163.com 



--

Thanks & Best regards 

李涛涛 Taotao · Li | Fixed Income@Datayes | Software Engineer 

Address: 8F Wanxiang Tower, No. 99 Lujiazui West Rd., Pudong New District,
Shanghai, 200120

Phone: 021-60216502 | Mobile: +86-18202171279
