Hi,
Still no luck with your guide.
Best,
Sun.

fightf...@163.com

From: Yuri Makhno
Date: 2015-04-01 15:26
To: fightf...@163.com
CC: Taotao.Li; user
Subject: Re: Re: rdd.cache() not working ?

cache() returns the RDD, so you can use something like this:

    val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1)))
    val cached = person.cache
    cached.count

When you rerun count on cached, you will see that the cache works.

On Wed, Apr 1, 2015 at 9:35 AM, fightf...@163.com <fightf...@163.com> wrote:

Hi,
That is just the issue. After running person.cache we then ran person.count; however, there is still no cached data shown in the web UI's Storage tab.

Thanks,
Sun.

fightf...@163.com

From: Taotao.Li
Date: 2015-04-01 14:02
To: fightfate
CC: user
Subject: Re: rdd.cache() not working ?

Rerun person.count and you will see the effect of the cache. person.cache does not cache the RDD right away; the RDD is actually cached only after one action (person.count here).

From: fightf...@163.com
To: "user" <user@spark.apache.org>
Sent: Wednesday, April 1, 2015, 1:21:25 PM
Subject: rdd.cache() not working ?

Hi, all,
Running the following code snippet through spark-shell, we cannot see any cached storage partitions in the web UI. Does this mean that the cache is not working? If we issue person.count again, we do not see any speedup either. I hope someone can explain this a little.

Best,
Sun.

    case class Person(id: Int, col1: String)
    val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt").map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1)))
    person.cache
    person.count

fightf...@163.com

--
---------------------------------------------------------------------------
Thanks & Best regards

李涛涛 Taotao Li | Fixed Income@Datayes | Software Engineer
Address: Wanxiang Tower 8F, Lujiazui West Rd. No. 99, Pudong New District, Shanghai, 200120
Phone: 021-60216502  Mobile: +86-18202171279
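[Editor's note, for the archives] The point made in the replies above can be summed up in one annotated spark-shell session: cache() is lazy and only marks the RDD with a storage level, the first action both computes the result and fills the cache, and only the second action is served from cached partitions. The HDFS path and the Person schema are taken from the original post; this sketch assumes a running spark-shell with sc already defined.

```scala
// In spark-shell; sc is the pre-defined SparkContext.
case class Person(id: Int, col1: String)

val person = sc.textFile("hdfs://namenode_host:8020/user/person.txt")
  .map(_.split(","))
  .map(p => Person(p(0).trim.toInt, p(1)))

person.cache()   // lazy: only marks the RDD for caching; nothing runs yet,
                 // which is why the Storage tab is still empty at this point

person.count()   // 1st action: reads from HDFS AND materializes the cached
                 // partitions; the web UI Storage tab fills in after this

person.count()   // 2nd action: served from the cached partitions, so this
                 // is the run where the speedup is actually visible
```

Note that cache() mutates the RDD's storage level and returns the RDD itself, so chaining through a separate val (as in Yuri's reply) and calling the actions on the original reference both hit the same cached data.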