On 06/08/2011 02:25 PM, Brian Bockelman wrote:
Hi,

Well, I feel compelled to ask: what are folks opinions about the IPv6-readiness 
of Hadoop?

Doing some searching around, I think it's stated up-front that Hadoop is not 
IPv6 ready, and there's little interest in IPv6 support as Hadoop is meant to 
be within the data center (which I suppose stays on IPv4?).  This leads to two 
questions:

1) How "un-ready" is the stack for IPv6?  Is it a matter of sending a committer
into a dark room for 6 months?  Or are there protocol issues?
2) How strongly do folks feel about not even attempting IPv6?

Alright - everyone go to work, ponder these questions, and attempt to get an 
IPv6 address.


Funnily enough, I'm just forcing IPv6 off on multiple Java processes on a new RHEL6 machine, which I find deeply ironic:
http://jira.smartfrog.org/jira/browse/SFOS-1601
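For anyone curious, "forcing IPv6 off" on a JVM is a one-flag affair via the standard java.net.preferIPv4Stack networking property; a sketch, assuming the usual hadoop-env.sh hook where HADOOP_OPTS gets picked up:

```shell
# Tell the JVM to use the IPv4 stack even on a dual-stack host.
# java.net.preferIPv4Stack is a standard JVM networking property;
# HADOOP_OPTS is the customary place to set JVM options in hadoop-env.sh.
HADOOP_OPTS="${HADOOP_OPTS:-} -Djava.net.preferIPv4Stack=true"
export HADOOP_OPTS
```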

I did some of the (hostname, IP address) caching code, primarily to get it working at home on a mess of a network (Netgear aren't the recommended vendor of Hadoop ToR switches). It doesn't like IPv6 there, but that's just an echo of the stack's lack of IPv6 support.

Places you need to play:
 - Jetty (works with IPv6)
 - RPC mechanisms
 - DFS
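One concrete reason the RPC and DFS layers need work: anywhere that builds a "host:port" string by naive concatenation is ambiguous for IPv6 literals, because the address itself contains colons. A minimal sketch of the bracketed form (the class name and helper are mine, not Hadoop's):

```java
import java.net.Inet6Address;
import java.net.InetAddress;

public class HostPortFormat {
    // Naive host + ":" + port is ambiguous for IPv6 literals, since the
    // address already contains colons; IPv6 literals in authority strings
    // need the bracketed form (e.g. [::1]:8020).
    static String hostPort(InetAddress addr, int port) {
        String host = addr.getHostAddress();
        if (addr instanceof Inet6Address) {
            host = "[" + host + "]";  // bracket the literal
        }
        return host + ":" + port;
    }

    public static void main(String[] args) throws Exception {
        // Literal addresses parse without any DNS lookup.
        System.out.println(hostPort(InetAddress.getByName("10.0.0.1"), 8020));
        System.out.println(hostPort(InetAddress.getByName("::1"), 8020));
    }
}
```

Every place the NN, JT, or clients print or parse an address would need auditing for this.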

I wouldn't send someone into a dark room for six months, though it is something that could/should be done on a branch by whoever is interested, and it will complicate testing. How big is the IPv6 cluster you have for people to test on, Brian?

What would make sense would be to add it incrementally: RPC, HDFS, then MR, then the layers on top (Hive, HBase, etc.), and see what performance hit you get by going to IPv6. Addresses get bigger; if they are cached in the NN and JT, that could limit cluster size.
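The "addresses get bigger" point is a 4x raw-size difference, which is easy to confirm from the JDK itself (the class name here is mine, for illustration):

```java
import java.net.InetAddress;

public class AddressSize {
    public static void main(String[] args) throws Exception {
        // Both are literals, so no DNS lookup is performed.
        byte[] v4 = InetAddress.getByName("10.0.0.1").getAddress();
        byte[] v6 = InetAddress.getByName("2001:db8::1").getAddress();
        System.out.println("IPv4 bytes: " + v4.length);  // 4
        System.out.println("IPv6 bytes: " + v6.length);  // 16
    }
}
```

Four extra words per cached address times every block location the NN tracks is where the memory pressure would show up.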

The big question is: do we need IPv6? I've just turned it off in my code, and most datacentres will only need a set of IPv4 addresses. Where IPv6 matters is when the organisation is moving to IPv6 and wants to provide external access to the whole FS.
