Hello All,

I am seeing behavior that I cannot explain in a reduce method I have written.
Here is the method:

    @Override
    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        String sValues = "";
        int iCount = 0;
        String sValue;
        while (values.hasNext()) {
            sValue = values.next().toString();
            iCount++;
            sValues += "\t" + sValue;

        }
        sValues += "\t" + iCount;
        // if (iCount == 2)
        output.collect(key, new Text(sValues));
    }

The code produces output like the following:

D0U0:GraduateStudent0                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent1                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent10                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent100                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent101                lehigh:GraduateStudent    1
D0U0:GraduateCourse0    1    2    1
D0U0:GraduateStudent102                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent103                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent104                lehigh:GraduateStudent    1    1    1
D0U0:GraduateStudent105                lehigh:GraduateStudent    1    1    1

The problem is that the output value should not contain so many 1's. The
output I expect is:

D0U0:GraduateStudent0                lehigh:GraduateStudent    1
D0U0:GraduateStudent1                lehigh:GraduateStudent    1
D0U0:GraduateStudent10                lehigh:GraduateStudent    1
D0U0:GraduateStudent100                lehigh:GraduateStudent    1
D0U0:GraduateStudent101                lehigh:GraduateStudent
D0U0:GraduateCourse0    2
D0U0:GraduateStudent102                lehigh:GraduateStudent    1
D0U0:GraduateStudent103                lehigh:GraduateStudent    1
D0U0:GraduateStudent104                lehigh:GraduateStudent    1
D0U0:GraduateStudent105                lehigh:GraduateStudent    1

If I do not append the iCount variable to sValues string, I get the
following output:

D0U0:GraduateStudent0                lehigh:GraduateStudent
D0U0:GraduateStudent1                lehigh:GraduateStudent
D0U0:GraduateStudent10                lehigh:GraduateStudent
D0U0:GraduateStudent100                lehigh:GraduateStudent
D0U0:GraduateStudent101                lehigh:GraduateStudent
D0U0:GraduateCourse0
D0U0:GraduateStudent102                lehigh:GraduateStudent
D0U0:GraduateStudent103                lehigh:GraduateStudent
D0U0:GraduateStudent104                lehigh:GraduateStudent
D0U0:GraduateStudent105                lehigh:GraduateStudent

This confirms that there are no 1's after each of those values (which I
already know from the input data). I do not know why the output is
distorted like that when I append iCount to sValues (as in the code
above). Can anyone help in this regard?
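
To check the logic in isolation, here is a minimal sketch of the same
concatenation in plain Java, with an Iterator<String> standing in for the
Hadoop Iterator<Text>. The class name ReduceLogicSketch and the method
joinWithCount are hypothetical names used only for illustration; they are not
part of the job. The second call is an assumption, not something confirmed
here: it feeds the method its own output, which is what would happen if the
same reduce logic ran more than once over the data (for example, if the class
were also registered as a combiner via JobConf.setCombinerClass).

```java
import java.util.Arrays;
import java.util.Iterator;

final class ReduceLogicSketch {

    // Same logic as the reduce body: prefix each value with a tab,
    // then append a tab and the number of values seen.
    static String joinWithCount(Iterator<String> values) {
        StringBuilder joined = new StringBuilder();
        int count = 0;
        while (values.hasNext()) {
            joined.append('\t').append(values.next());
            count++;
        }
        return joined.append('\t').append(count).toString();
    }

    public static void main(String[] args) {
        // One pass over a single input value.
        String once = joinWithCount(
                Arrays.asList("lehigh:GraduateStudent").iterator());

        // ASSUMPTION: a second pass that receives the first pass's output
        // as its only value.
        String twice = joinWithCount(Arrays.asList(once).iterator());

        System.out.println("once:  [" + once + "]");
        System.out.println("twice: [" + twice + "]");
    }
}
```

A single pass gives one leading tab and one count, which matches the output
I expect. The second pass adds another leading tab and another count, which
resembles the extra whitespace and repeated 1's in the actual output.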

The second problem is equally perplexing. The reduce method I actually want
to run is the following:

    @Override
    public void reduce(Text key, Iterator<Text> values,
            OutputCollector<Text, Text> output, Reporter reporter)
            throws IOException {
        String sValues = "";
        int iCount = 0;
        String sValue;
        while (values.hasNext()) {
            sValue = values.next().toString();
            iCount++;
            sValues += "\t" + sValue;

        }
        sValues += "\t" + iCount;
        if (iCount == 2)
            output.collect(key, new Text(sValues));
    }

I want to emit output only if "values" contains exactly two elements. From
the output above you can see that there is at least one key/value pair whose
values contain exactly two elements. But when I run the code I get an
empty output file. Can anyone explain this?
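
For comparison, here is a minimal sketch of the filtering behavior I intend,
written in plain Java so it can run without a cluster. The class name
CountFilterSketch is hypothetical, a Map stands in for the OutputCollector,
and the sample keys and values are taken from the output shown earlier.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class CountFilterSketch {

    // Mirrors the reduce body with the `if (iCount == 2)` guard.
    // `out` is a stand-in for the OutputCollector (an assumption for this sketch).
    static void reduce(String key, Iterator<String> values,
                       Map<String, String> out) {
        StringBuilder joined = new StringBuilder();
        int count = 0;
        while (values.hasNext()) {
            joined.append('\t').append(values.next());
            count++;
        }
        joined.append('\t').append(count);
        // Collect only keys that had exactly two values.
        if (count == 2) {
            out.put(key, joined.toString());
        }
    }

    public static void main(String[] args) {
        Map<String, String> out = new LinkedHashMap<>();
        // One value: should be dropped.
        reduce("D0U0:GraduateStudent100",
               Arrays.asList("lehigh:GraduateStudent").iterator(), out);
        // Two values: should be collected.
        reduce("D0U0:GraduateStudent101",
               Arrays.asList("lehigh:GraduateStudent",
                             "D0U0:GraduateCourse0").iterator(), out);
        System.out.println(out);
    }
}
```

Run once over the grouped values, only D0U0:GraduateStudent101 is collected.
This is the behavior I expected from the job, so the logic itself appears to
work when it sees all values for a key in a single call.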

I have tried many versions of the code (e.g. using StringBuffer instead of
String, using flags instead of integer count) but nothing works. Are these
problems due to bugs in Hadoop? Please let me know any kind of solution you
can think of.

Thanks,

-- 
Mohammad Farhan Husain
Research Assistant
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
University of Texas at Dallas
