Hi,
I developed a simple "custom streaming" implementation that allows mapper-only text processing, so the results are not shuffled by key sorting. We use it successfully for semantic processing of very large data sets at Pisa University, and AFAIK it is also used at the Bruno Kessler Foundation (http://www.fbk.eu) for similar purposes.
You can find the source code and documentation here:
http://medialab.di.unipi.it/wiki/Hadoop_Streams
I'm posting it here for your consideration because it seems to be a feature that Hadoop lacks, and it might be a useful improvement for a future release.
Best regards,
--francesco
