Thanks, Rasit, for your suggestion. I should have let the group know earlier that I solved the problem, and it had nothing to do with the reduce method itself. I had also set my reducer class as the combiner, which is not appropriate in this case: the combiner's output becomes the reducer's input, so the count was appended before the reducer ever saw the values. I removed the combiner and everything works fine now. I think the Map/Reduce tutorial on Hadoop's website should say more about the combiner. In the word count example the reducer can double as a combiner, but that is not true of every problem. The tutorial should highlight this more.
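For anyone who runs into the same thing, here is a minimal sketch in plain Java (no Hadoop dependency) of why reusing this reducer as the combiner distorts the output. The class `CombinerDemo` and the helper `combineThenReduce` are hypothetical names made up for illustration; they only imitate the framework by feeding the combiner's output into the reducer. Hadoop may run the combiner zero, one, or several times, so the exact number of extra 1's can vary; the sketch assumes a single combiner pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class CombinerDemo {

    // Same logic as the reduce method quoted in this thread: join the values
    // with tabs and append the number of values seen. With onlyIfTwo set,
    // nothing is emitted unless exactly two values arrived.
    static List<String> reduce(List<String> values, boolean onlyIfTwo) {
        StringBuilder sValues = new StringBuilder();
        int iCount = 0;
        for (String sValue : values) {
            iCount++;
            sValues.append("\t").append(sValue);
        }
        sValues.append("\t").append(iCount);
        if (onlyIfTwo && iCount != 2) {
            return Collections.emptyList();
        }
        return Collections.singletonList(sValues.toString());
    }

    // Hypothetical stand-in for the framework (an assumption, not Hadoop code):
    // each inner list holds the values one combiner invocation sees for a key,
    // and whatever the combiner emits becomes the reducer's input.
    static List<String> combineThenReduce(List<List<String>> chunks, boolean onlyIfTwo) {
        List<String> reducerInput = new ArrayList<>();
        for (List<String> chunk : chunks) {
            reducerInput.addAll(reduce(chunk, onlyIfTwo));
        }
        if (reducerInput.isEmpty()) {
            return Collections.emptyList(); // the reducer is never called for this key
        }
        return reduce(reducerInput, onlyIfTwo);
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("lehigh:GraduateStudent");
        List<String> two = Arrays.asList("lehigh:GraduateStudent", "D0U0:GraduateCourse0");

        // Without a combiner the count is appended exactly once.
        System.out.println("no combiner:        " + reduce(one, false));
        // With the reducer as combiner the count is appended twice.
        System.out.println("with combiner:      " + combineThenReduce(Arrays.asList(one), false));
        // The iCount == 2 filter passes without a combiner ...
        System.out.println("filter, no combiner: " + reduce(two, true));
        // ... but emits nothing once the combiner has collapsed the two values into one.
        System.out.println("filter, combiner:    " + combineThenReduce(Arrays.asList(two), true));
    }
}
```

The second case shows the extra 1's: the reducer receives a value that already ends in a count and appends another. The last case shows the empty output file: after the combiner merges two values into one string, the reducer sees a single value, so the `iCount == 2` test can no longer succeed.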

On Thu, Apr 2, 2009 at 8:50 AM, Rasit OZDAS <[email protected]> wrote:

> Hi, Husain,
>
> 1. You can use a boolean control in your code.
>     boolean hasAlreadyOned = false;
>     int iCount = 0;
>     String sValue;
>     while (values.hasNext()) {
>         sValue = values.next().toString();
>         iCount++;
>         if (sValue.equals("1"))
>             hasAlreadyOned = true;
>
>         if (!hasAlreadyOned)
>             sValues += "\t" + sValue;
>     }
>     ...
>
> 2. You're actually controlling for 3 elements, not 2. You should use
>    if (iCount == 1)
>
> 2009/4/1 Farhan Husain <[email protected]>
>
> > Hello All,
> >
> > I am facing some problems with a reduce method I have written which I
> > cannot understand. Here is the method:
> >
> > @Override
> > public void reduce(Text key, Iterator<Text> values,
> >         OutputCollector<Text, Text> output, Reporter reporter)
> >         throws IOException {
> >     String sValues = "";
> >     int iCount = 0;
> >     String sValue;
> >     while (values.hasNext()) {
> >         sValue = values.next().toString();
> >         iCount++;
> >         sValues += "\t" + sValue;
> >
> >     }
> >     sValues += "\t" + iCount;
> >     //if (iCount == 2)
> >     output.collect(key, new Text(sValues));
> > }
> >
> > The output of the code is like the following:
> >
> > D0U0:GraduateStudent0 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent1 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent10 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent100 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent101 lehigh:GraduateStudent 1 D0U0:GraduateCourse0 1 2 1
> > D0U0:GraduateStudent102 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent103 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent104 lehigh:GraduateStudent 1 1 1
> > D0U0:GraduateStudent105 lehigh:GraduateStudent 1 1 1
> >
> > The problem is there cannot be so many 1's in the output value.
> > The output which I expect should be like this:
> >
> > D0U0:GraduateStudent0 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent1 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent10 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent100 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent101 lehigh:GraduateStudent D0U0:GraduateCourse0 2
> > D0U0:GraduateStudent102 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent103 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent104 lehigh:GraduateStudent 1
> > D0U0:GraduateStudent105 lehigh:GraduateStudent 1
> >
> > If I do not append the iCount variable to sValues string, I get the
> > following output:
> >
> > D0U0:GraduateStudent0 lehigh:GraduateStudent
> > D0U0:GraduateStudent1 lehigh:GraduateStudent
> > D0U0:GraduateStudent10 lehigh:GraduateStudent
> > D0U0:GraduateStudent100 lehigh:GraduateStudent
> > D0U0:GraduateStudent101 lehigh:GraduateStudent D0U0:GraduateCourse0
> > D0U0:GraduateStudent102 lehigh:GraduateStudent
> > D0U0:GraduateStudent103 lehigh:GraduateStudent
> > D0U0:GraduateStudent104 lehigh:GraduateStudent
> > D0U0:GraduateStudent105 lehigh:GraduateStudent
> >
> > This confirms that there is no 1's after each of those values (which I
> > already know from the input data). I do not know why the output is
> > distorted like that when I append the iCount to sValues (like the given
> > code). Can anyone help in this regard?
> >
> > Now comes the second problem which is equally perplexing.
> > Actually, the reduce method which I want to run is like the following:
> >
> > @Override
> > public void reduce(Text key, Iterator<Text> values,
> >         OutputCollector<Text, Text> output, Reporter reporter)
> >         throws IOException {
> >     String sValues = "";
> >     int iCount = 0;
> >     String sValue;
> >     while (values.hasNext()) {
> >         sValue = values.next().toString();
> >         iCount++;
> >         sValues += "\t" + sValue;
> >
> >     }
> >     sValues += "\t" + iCount;
> >     if (iCount == 2)
> >         output.collect(key, new Text(sValues));
> > }
> >
> > I want to output only if "values" contained only two elements. By looking
> > at the output above you can see that there is at least one such key values
> > pair where values have exactly two elements. But when I run the code I get
> > an empty output file. Can anyone solve this?
> >
> > I have tried many versions of the code (e.g. using StringBuffer instead of
> > String, using flags instead of integer count) but nothing works. Are these
> > problems due to bugs in Hadoop? Please let me know any kind of solution you
> > can think of.
> >
> > Thanks,
> >
> > --
> > Mohammad Farhan Husain
> > Research Assistant
> > Department of Computer Science
> > Erik Jonsson School of Engineering and Computer Science
> > University of Texas at Dallas
>
> --
> M. Raşit ÖZDAŞ

--
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas
