Hi all,

I am trying to use a memory file system in Hadoop to hold the map tasks' intermediate files. The idea is very simple:

1. Since memory is limited, once it fills up the data is written to disk instead.
2. When a file in memory is deleted and space frees up, a background thread prefetches data from disk back into memory.
3. If the data is not in memory, it is read directly from disk.
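To make the idea concrete, here is a rough sketch of the logic I have in mind. Everything here (the class name `TieredIntermediateStore`, the methods, the eviction policy) is made up for illustration; none of it is part of Hadoop's API:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch of the tiered memory/disk store described above.
    public class TieredIntermediateStore {
        private final long memoryLimit;   // max bytes to keep in RAM
        private final Map<String, byte[]> memory = new ConcurrentHashMap<>();
        private long memoryUsed = 0;

        public TieredIntermediateStore(long memoryLimit) {
            this.memoryLimit = memoryLimit;
        }

        // 1. Keep the file in memory while there is room; otherwise spill to disk.
        public synchronized void write(String name, byte[] data) throws IOException {
            if (memoryUsed + data.length <= memoryLimit) {
                memory.put(name, data);
                memoryUsed += data.length;
            } else {
                Files.write(Paths.get(name), data);   // spill to local disk
            }
        }

        // 2. On delete, free the space; a background prefetch thread could be
        // notified here to pull the next spilled file back into memory.
        public synchronized void delete(String name) {
            byte[] removed = memory.remove(name);
            if (removed != null) {
                memoryUsed -= removed.length;
            }
        }

        // 3. Serve reads from memory if present, else fall back to disk directly.
        public byte[] read(String name) throws IOException {
            byte[] data = memory.get(name);
            return (data != null) ? data : Files.readAllBytes(Paths.get(name));
        }
    }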
But when I tried to implement this in Hadoop, I found that when the TaskTracker receives a new map or reduce task, it starts a new child process. So if I use a memory file system, the intermediate files live in the map task process's address space, and the TaskTracker can't access them. Any suggestions? Thanks a lot :)