Steve,

A DoS could not be mounted using excludedNodes.
The blacklisting takes place only at the DFSClient level. The NN returns a list of block locations that excludes the nodes the client has flagged, and this list isn't persisted anywhere on the server. So even if a client excludes the entire set of DNs, other clients won't be affected.

Cosmin

On 1/22/10 5:32 PM, "Steve Loughran" <ste...@apache.org> wrote:
> Stack wrote:
> > I'm being 0 on this
>
> -I would worry if the exclusion list were used by the NN to do its
> blacklisting; I'm glad to see this isn't happening. Yes, you could pick
> up datanode failure faster, but you would also be vulnerable to a user
> doing a DoS against the cluster by reporting every DN as failing.
>
> -Russ Perry's work on high-speed Hadoop rendering [1] tweaked Hadoop to
> allow the datanodes to get the entire list of nodes holding the data,
> and allowed them to make their own decision about where to get the data
> from. This
> 1. pushed the policy of handling failure down to the clients, with less
> need to talk to the NN about it.
> 2. lets you do something very fancy where you deliberately choose data
> from different DNs, so that you can then pull data off the cluster at
> the full bandwidth of every disk.
>
> Long term, I would like to see Russ's addition go in, so I wonder whether
> the HDFS-630 patch would be useful long term. Maybe it's a more fundamental
> issue: where does the decision making go, into the clients or into the NN?
>
> -steve
>
> [1] http://www.hpl.hp.com/techreports/2009/HPL-2009-345.html
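To make the point concrete, here is a toy sketch in Python (not the actual DFSClient/NameNode code; the function and node names are made up): the "namenode" filters locations per request and keeps no record of who excluded what, so one client's exclusion list cannot affect another's.

```python
# Toy model of HDFS-630-style excludedNodes: exclusion is a
# per-request parameter, never server-side state.

DATANODES = ["dn1", "dn2", "dn3", "dn4"]

def get_block_locations(excluded_nodes=()):
    """Return candidate datanodes for a block, minus the caller's
    excluded set. Nothing about the exclusion is persisted."""
    excluded = set(excluded_nodes)
    return [dn for dn in DATANODES if dn not in excluded]

# Client A distrusts every datanode but dn1...
client_a = get_block_locations(excluded_nodes=["dn2", "dn3", "dn4"])

# ...yet client B, making a fresh request, still sees the full set:
client_b = get_block_locations()

print(client_a)  # ['dn1']
print(client_b)  # ['dn1', 'dn2', 'dn3', 'dn4']
```

Since the filter is recomputed from the caller's own argument on every call, excluding all DNs only starves the misbehaving client itself.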