Hi all,

Why do Hadoop jobs need setup and cleanup phases, which consume a lot of time? Why couldn't we do it the way a distributed RDBMS does, where a master process coordinates all slave nodes through sockets? I think it would save plenty of time if there were no setup and cleanup phases. What is Hadoop's philosophy on this?

Thanks,
Min

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com