Hi Matt,

In my opinion, the basic difference between MapReduce V1 and V2 is not about the mapred or mapreduce API package, but about the platform that runs the job. In MapReduce V1, jobs were managed by the JobTracker and the TaskTrackers. With MapReduce V2, the resource management part of the MapReduce project was spun off and evolved into YARN, a generic distributed resource management system. MapReduce, as well as other types of applications, can run on this common platform. The remaining part, which is the code base of MapReduce V2, is a pure distributed computation framework.
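To make the platform point a bit more concrete, here is a minimal sketch of how a Hadoop 2 client is typically told which runtime to submit jobs to. I'm assuming a stock mapred-site.xml here; your distribution may set this elsewhere or manage it for you. The mapreduce.framework.name property accepts local, classic (the JobTracker/TaskTracker runtime) or yarn:

```xml
<!-- mapred-site.xml (assumed location; adjust for your distribution) -->
<configuration>
  <property>
    <!-- Selects the runtime that executes MapReduce jobs: local, classic or yarn -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

The job code itself does not name the runtime; that choice is made by this setting, independently of whether the job uses mapred.* or mapreduce.*.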

With regard to the API packages, both mapred.* and mapreduce.* have existed since MapReduce V1, but mapreduce.* has evolved a lot. If you're writing a new MapReduce application against the latest Hadoop libraries, it's MapReduce V2 no matter whether you're using mapred.* or mapreduce.*. If you already have MapReduce applications that were built with the MapReduce V1 framework and use the mapred.* APIs, they are supposed to run on YARN without problems. However, if those applications use the mapreduce.* APIs, you may need to recompile them against the MapReduce V2 framework to be able to run them on YARN.

Here are some resources that you may want to look at for further information:

http://hortonworks.com/hadoop/yarn/
http://hortonworks.com/blog/running-existing-applications-on-hadoop-2-yarn/
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

Thanks,
Zhijie

On Fri, Jan 3, 2014 at 2:19 AM, Matt Fellows <matt.fell...@bespokesoftware.com> wrote:

> I'm thoroughly confused about which API is the recent one, which is the
> old one and which method I should be using to write MapReduce applications.
>
> I'm under the impression that MRv2 is primarily driven by the
> org.apache.hadoop.mapreduce.* packages and MRv1 is primarily driven by the
> org.apache.hadoop.mapred.* packages.
>
> I've been led to believe that MRv2 applications extend MapReduceBase and
> implement Mapper, Reducer etc.
> and conversely the MRv1 applications extend Mapper, Reducer directly.
>
> However I can not find a canonical statement to back any of this up.
> What's more I keep finding conflicting statements about these, such as
> "'Hadoop - the definitive guide' gives example in MRv2 format" but then I
> look at the examples and they use org.apache.hadoop.mapreduce.* packages,
> but extend Mapper and extend Reducer, not MapReduceBase...
>
> Can someone either point me at a canonical resource or just confirm / deny
> my assumptions?
>
> Kind regards

--
Zhijie Shen
Hortonworks Inc.
http://hortonworks.com/