How was this cluster configured? Run hdfs dfsadmin -report to see the
aggregate configured cache capacity as seen by the NN. You need to configure
some GBs of cache on each DN (dfs.datanode.max.locked.memory in
hdfs-site.xml) and also raise the ulimit for max locked memory (ulimit -l)
for the user running the DN.
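
As a rough sanity check for the locked-memory limit, something like the
sketch below could be run on each DN. The helper name can_lock is made up for
illustration, and it assumes ulimit -l reports its value in KB (or the word
"unlimited"), as bash does on Linux.

```shell
# Hypothetical helper (not part of Hadoop): succeeds if a max-locked-memory
# limit is large enough to hold want_bytes of cached data.
#   $1 = bytes you want to cache on this DataNode
#   $2 = limit as printed by `ulimit -l` (assumed KB, or "unlimited")
can_lock() {
  want_bytes=$1
  limit_kb=$2
  if [ "$limit_kb" = "unlimited" ]; then
    return 0
  fi
  # Convert the KB limit to bytes before comparing.
  [ $((limit_kb * 1024)) -ge "$want_bytes" ]
}
```

For the 209715206-byte file in this thread, you would call
can_lock 209715206 "$(ulimit -l)" as the user that runs the DN. A limit of
64 (KB), which is a common default, is far too small.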

I'll also note that you are unlikely to see speedups with most MapReduce
jobs, especially ones running against TextInputFormat. There is a lot of
copying and string splitting, so such jobs are typically not I/O bound.

The fs -tail command is likely also spending a lot of time on startup
costs, so I wouldn't expect much end-to-end latency savings.

Best,
Andrew


On Sun, Mar 9, 2014 at 5:15 AM, Azuryy Yu <azury...@gmail.com> wrote:

> add dev.
> ---------- Forwarded message ----------
> From: "hwpstorage" <hwpstor...@gmail.com>
> Date: Mar 7, 2014 11:38 PM
> Subject: problem with HDFS caching in Hadoop 2.3
> To: <u...@hadoop.apache.org>
> Cc:
>
> Hello,
>
> It looks like the HDFS caching does not work well.
> The cached log file is around 200MB. The hadoop cluster has 3 nodes, each
> has 4GB memory.
>
> -bash-4.1$ hdfs cacheadmin -addPool wptest1
> Successfully added cache pool wptest1.
>
> -bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listPools
> Found 1 result.
> NAME     OWNER  GROUP  MODE            LIMIT  MAXTTL
> wptest1  hdfs   hdfs   rwxr-xr-x   unlimited   never
>
> -bash-4.1$ hdfs cacheadmin -addDirective -path hadoop003.log -pool wptest1
> Added cache directive 1
>
> -bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
> real    0m2.796s
> user    0m4.263s
> sys     0m0.203s
>
> -bash-4.1$  time /hadoop/hadoop-2.3.0/bin/hadoop fs -tail hadoop003.log
> real    0m3.050s
> user    0m4.176s
> sys     0m0.192s
>
> It is weird that the cache status shows 0 bytes cached:
>
> -bash-4.1$ /hadoop/hadoop-2.3.0/bin/hdfs cacheadmin -listDirectives -stats -path hadoop003.log -pool wptest1
> Found 1 entry
> ID POOL     REPL EXPIRY  PATH                      BYTES_NEEDED  BYTES_CACHED  FILES_NEEDED  FILES_CACHED
>  1 wptest1     1 never   /user/hdfs/hadoop003.log     209715206             0             1             0
>
> -bash-4.1$ file /hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0
> /hadoop/hadoop-2.3.0/lib/native/libhadoop.so.1.0.0: ELF 64-bit LSB shared
> object, x86-64, version 1 (SYSV), dynamically linked, not stripped
>
> I also tried the word count example with the same file. The execution time
> is always 40 seconds. (The map/reduce job without cache is 42 seconds)
> Is there anything wrong?
> Thanks a lot
>
