This paper might help you here. I believe the authors have made some source code publicly available for this.
http://www.eng.auburn.edu/~xqin/pubs/hcw10.pdf

-Faraz

On Mon, Dec 6, 2010 at 10:29 AM, yipeng <yip...@gmail.com> wrote:
> Dear all,
>
> I understand that specifying data placement may not be in the spirit of
> Hadoop, however I would like to explicitly locate data input on various
> nodes to assert the value of data locality and cost of network transfer.
>
> Does anyone have any advice about how I can go about the data placement?
>
> Thanks,
>
> Yipeng
>