To get the best performance from Hadoop, we can configure the network topology. Based on that topology, HDFS applies its rack-awareness algorithms when writing and reading files. HDFS-2246 will also improve performance by reading local replicas directly.
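
As a rough illustration of the topology configuration mentioned above: Hadoop can call an external script that maps node addresses to rack paths. The script is registered in core-site.xml through the property topology.script.file.name (net.topology.script.file.name in later releases). Below is a minimal sketch of such a script. The subnet prefixes, rack names, and the resolve() helper are made-up examples, not anything from a real cluster.

```python
#!/usr/bin/env python3
"""Hypothetical rack-mapping script for Hadoop's topology script hook."""
import sys

# Assumed example mapping from subnet prefix to rack path.
# Replace with the real layout of your cluster.
RACKS = {
    "10.1.1.": "/dc1/rack1",
    "10.1.2.": "/dc1/rack2",
}

# Fallback rack for addresses that match no known prefix.
DEFAULT_RACK = "/default-rack"


def resolve(host):
    """Return the rack path for one IP address or host name."""
    for prefix, rack in RACKS.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK


if __name__ == "__main__":
    # Hadoop passes one or more addresses as arguments and reads
    # one whitespace-separated rack path per argument from stdout.
    print(" ".join(resolve(h) for h in sys.argv[1:]))
```

With a mapping like this in place, the NameNode knows which nodes share a rack and can spread replicas across racks accordingly.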
If you have a good algorithm that gives better performance than the current ones, please file a JIRA with your proposed strategy and a design doc. :-)

Thanks & Regards,
Uma

----- Original Message -----
From: gschen <gongsuoc...@yahoo.com.cn>
Date: Tuesday, October 11, 2011 9:26 am
Subject: Re: Strategy Of Replica
To: common-dev@hadoop.apache.org

> On 2011/10/11 11:13, Uma Maheswara Rao G 72686 wrote:
> > I did not understand your proposed strategy implementations.
> >
> > Note that you can already set the replication level per file.
> > If you set a lower replication factor, performance and space usage
> > will obviously benefit, but the risk of data loss will also be
> > higher. I think we can meet your requirements using the replication
> > factor. Are your expectations something different from this?
> >
> > Regards,
> > Uma
> > ----- Original Message -----
> > From: gschen<gongsuoc...@yahoo.com.cn>
> > Date: Tuesday, October 11, 2011 8:14 am
> > Subject: Strategy Of Replica
> > To: "common-dev@hadoop.apache.org"<common-dev@hadoop.apache.org>
> >
> >> Hi guys,
> >> What do you think of the replication strategy in HDFS? How about
> >> a customizable strategy, where users define their own replication
> >> strategy based on price, performance, and so on?
> >>
> >> Thank you in advance.
> >>
> Thanks for your reply. In HDFS the only thing we can do to change
> the replication strategy is set the replication factor. We cannot
> change where a block is stored or what type of storage holds the
> data. Consider this case: to improve download speed, I could choose
> to place my block replicas near my location or near someone else's
> location. I mean that users could have more options for deciding
> their block replication strategy.
>