Hi,

> Thanks Arun. Changing the mtime is a good idea. However, given a file (the
> path is A/B/C/D/file) distributed to all the nodes, if I just change the
> mtime of the file to an earlier timestamp, it will not be replaced next time.
> Should I also change the mtime of all the directories along the path (A, B,
> C and D)? Whose timestamp is used by DistributedCache?

It is the timestamp of the file on DFS. So if you modify the file's timestamp
on DFS, it should be re-distributed to all the nodes.

Thanks
Hemanth

> Thanks.
> -Gang
>
> ----- Original Message -----
> From: Arun C Murthy <a...@yahoo-inc.com>
> To: mapreduce-u...@hadoop.apache.org
> Sent: 2010/8/22 (Sun) 9:38:02 PM
> Subject: Re: where distributed cache start working
>
> Moving to mapreduce-user@, bcc common-...@. Please use the project-specific
> lists.
>
> DistributedCache.purgeCache isn't a public API. You shouldn't be calling it
> from the task.
>
> A simple way of doing what you want is to change the mtime of the cache
> files on HDFS.
>
> Arun
>
> On Aug 22, 2010, at 9:48 AM, Gang Luo wrote:
>
>> Thanks Jeff.
>>
>> However, are you sure TaskRunner.run() is also used in the new API? I used
>> btrace to trace the function calls but didn't find this function being
>> called anywhere.
>>
>> One more question about the distributed cache. After I call
>> DistributedCache.purgeCache, I think the local cached files should be
>> deleted or invalidated. However, when I run the same job with the purge
>> operation at the end multiple times, I find the local files have never been
>> deleted and the modification time is that of the first job run. How can I
>> make my job re-distribute the cache anyway?
>>
>> Thanks,
>> -Gang
>>
>> ----- Original Message -----
>> From: Jeff Zhang <zjf...@gmail.com>
>> To: common-dev@hadoop.apache.org
>> Sent: 2010/8/20 (Fri) 11:22:49 AM
>> Subject: Re: where distributed cache start working
>>
>> Hi Gang,
>>
>> In the TaskRunner's run() method, Hadoop downloads the cache files that
>> you set on the client side to local disk; the forked child JVM can then
>> use these cache files locally.
>>
>> On Fri, Aug 20, 2010 at 8:08 AM, Gang Luo <lgpub...@yahoo.com.cn> wrote:
>>> Hi all,
>>> I went through the code, but couldn't find the place where the
>>> distributed cache starts working.
>>> I want to know, between DistributedCache.addCacheFile at the master node
>>> and DistributedCache.getLocalCacheFiles at the client side, when and where
>>> the files get distributed.
>>>
>>> Thanks,
>>> -Gang
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
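
As a footnote to the mtime discussion in this thread, here is a rough sketch of why changing only the file's own timestamp is enough. It is not Hadoop code: it assumes the cache treats a local copy as fresh only while the timestamp recorded at localization time equals the source file's current mtime, and it uses a local temp file as a stand-in for the file on DFS. The class and method names (CacheFreshnessSketch, isFresh, touchInvalidates) are made up for illustration. On a real cluster the timestamp would be changed on HDFS itself, for example with FileSystem.setTimes(path, mtime, atime).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical sketch: a local temp file stands in for the cache file on DFS.
final class CacheFreshnessSketch {

    // Assumption: the local copy is reused only while the timestamp recorded
    // at localization time still equals the source file's current mtime.
    static boolean isFresh(long recordedMtimeMillis, long currentMtimeMillis) {
        return recordedMtimeMillis == currentMtimeMillis;
    }

    // Reads a file's mtime in milliseconds, without checked exceptions.
    static long mtimeMillis(Path file) {
        try {
            return Files.getLastModifiedTime(file).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sets the mtime of the file only; parent directories are left untouched.
    static void setMtimeMillis(Path file, long millis) {
        try {
            Files.setLastModifiedTime(file, FileTime.fromMillis(millis));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if changing only the file's mtime makes the copy stale.
    static boolean touchInvalidates() {
        try {
            Path source = Files.createTempFile("cache-source", ".dat");
            try {
                // Whole-second values sidestep coarse timestamp granularity.
                setMtimeMillis(source, 1_000_000_000_000L);
                long recorded = mtimeMillis(source); // taken when "localized"
                boolean freshBefore = isFresh(recorded, mtimeMillis(source));

                setMtimeMillis(source, 1_000_000_060_000L); // change source mtime
                boolean freshAfter = isFresh(recorded, mtimeMillis(source));

                return freshBefore && !freshAfter;
            } finally {
                Files.deleteIfExists(source);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("touch invalidates cached copy: " + touchInvalidates());
    }
}
```

Under this assumption any change to the source file's mtime, earlier or later, makes the recorded timestamp stop matching, and the directories A, B, C and D never enter the comparison.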