How are you intending to verify the map output? It's only partially dumped to disk. None of the intermediate data goes into HDFS.

Daniel

On Aug 25, 2016 4:10 PM, "xeon Mailinglist" <xeonmailingl...@gmail.com> wrote:

> But then I need to set identity maps to run the reducers. If I suspend a
> job after the maps finish, I don't need to set identity maps up. I want to
> suspend a job so that I don't run identity maps and get better performance.
>
> On Aug 25, 2016 10:12 PM, "Haibo Chen" <haiboc...@cloudera.com> wrote:
>
> One thing you can try is to write a map-only job first and then verify the
> map out.
>
> On Thu, Aug 25, 2016 at 1:18 PM, xeon Mailinglist <xeonmailingl...@gmail.com> wrote:
>
> > I am using Mapreduce v2.
> >
> > On Aug 25, 2016 8:18 PM, "xeon Mailinglist" <xeonmailingl...@gmail.com> wrote:
> >
> > > I am trying to implement a mechanism in MapReduce v2 that allows to
> > > suspend and resume a job. I must suspend a job when all the mappers
> > > finish, and resume the job from that point after some time. I do this,
> > > because I want to verify the integrity of the map output before
> > > executing the reducers.
> > >
> > > I am looking for the class that tells when the Reduce tasks should
> > > start. Does anyone know where is this?
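
On the original question of when the reduce tasks start: as far as I know, in MRv2 this is governed by the job property mapreduce.job.reduce.slowstart.completedmaps, the fraction of map tasks that must complete before reducers are scheduled (default 0.05). The fragment below is a minimal sketch, assuming the goal is only to hold reducers back until every map has finished. It does not suspend or resume the job.

```xml
<!-- Job configuration fragment, e.g. in mapred-site.xml or set per job. -->
<!-- Assumption: 1.0 means reducers are scheduled only after all maps complete. -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>1.0</value>
</property>
```

This setting only delays reducer scheduling; the intermediate map output still stays on the local disks of the nodes that ran the maps and is not written to HDFS. The map-only job suggested above (zero reduce tasks) is the case where map output goes to the job's output directory and can be inspected there.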