Hi,
I am a graduate student in the Computer Science department at SUNY Stony
Brook. I am thinking of doing a project on Hadoop for my course "Cloud
Computing", taught by Prof. Radu Sion.
While going through the links on the "Yahoo open source projects for
students" page, I found the idea
"Research on new hashing schemes for filesystem namespace partitioning"
interesting. It looks to me as though the idea is to assign one subtree of
the whole namespace to one namenode and another subtree to another namenode.
How is LSH better than normal hashing? Under either scheme, a client or a
fixed namenode still has to decide which namenode to contact. It looks to
me that if requests for files under the same subtree are directed to the
same namenode, performance will be better, because the requests reaching a
given namenode are clustered around one part of the namespace (for example,
the part on which a client is currently operating). Is this assumption
correct? Could I have more insight in this regard?
Thanks,
Ketan
