All,

I am new to Hadoop and trying to learn its internals. I have a few questions 
about something I am trying to implement. I am still learning, so these 
questions may seem naive or rest on wrong assumptions; any guidance is highly 
appreciated.

I am working with release 0.18.3 and experimenting with job scheduling. Instead 
of placing data blocks on specific nodes when the input files are copied into 
HDFS, I want to move the blocks from their original locations to the desired 
(more efficient) nodes just before job submission.

My questions are:
1. Will such a change improve performance, considering the overhead of moving 
the data blocks?
2. I believe I will have to start from the NameNode to move the blocks. A brief 
explanation of how to implement this, or pointers to sources of information, 
would be very helpful.
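
To make question 1 concrete, here is a rough back-of-envelope sketch of the
trade-off as I understand it. It is only an illustration under simplifying
assumptions: one block is copied and then processed serially, with no overlap
between the copy and the computation and no other cluster traffic. The class
name, method names, and all throughput figures are hypothetical; only the 64 MB
block size reflects the HDFS default in this release.

```java
public class BlockMoveEstimate {

    // Seconds to process one block where it already sits.
    // blockMb: block size in MB; origMbPerSec: assumed processing rate on the original node.
    public static double timeInPlace(double blockMb, double origMbPerSec) {
        return blockMb / origMbPerSec;
    }

    // Seconds to copy the block over the network and then process it on the faster node.
    // netMbPerSec: assumed network throughput; fastMbPerSec: assumed rate on the target node.
    public static double timeWithMove(double blockMb, double netMbPerSec, double fastMbPerSec) {
        return blockMb / netMbPerSec + blockMb / fastMbPerSec;
    }

    // Under this serial model, moving pays off only if copy time plus
    // processing time on the target node beats processing in place.
    public static boolean worthMoving(double blockMb, double netMbPerSec,
                                      double origMbPerSec, double fastMbPerSec) {
        return timeWithMove(blockMb, netMbPerSec, fastMbPerSec)
                < timeInPlace(blockMb, origMbPerSec);
    }

    public static void main(String[] args) {
        double blockMb = 64.0;        // default HDFS block size
        double netMbPerSec = 100.0;   // hypothetical network throughput
        double origMbPerSec = 20.0;   // hypothetical rate on the original node
        double fastMbPerSec = 40.0;   // hypothetical rate on the "efficient" node

        System.out.println("in place:  " + timeInPlace(blockMb, origMbPerSec) + " s");
        System.out.println("with move: " + timeWithMove(blockMb, netMbPerSec, fastMbPerSec) + " s");
        System.out.println("worth moving: "
                + worthMoving(blockMb, netMbPerSec, origMbPerSec, fastMbPerSec));
    }
}
```

With these made-up figures the move pays off, but it stops paying off as soon
as the network is slow relative to the speed difference between the nodes
(for example, 10 MB/s instead of 100 MB/s). That sensitivity is what I am
unsure about.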

Thanks,
Arun