Hi Sara,
On the surface your change looks okay to me.  But it's hard to say, really.
It looks like the code expected to read more data. Perhaps you could add some logging around the statements that failed and try to get a sense of how much data, and what data, had been successfully read just prior to the failure.
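Just a rough sketch of the kind of logging I have in mind (System.err ends up in the task/driver logs; EOFException here is java.io.EOFException):

@Override
public void readFields(DataInput in) throws IOException {
  try {
    label = in.readDouble();
    System.err.println("Leaf.readFields: label = " + label);
    leafWeight = in.readDouble();
    System.err.println("Leaf.readFields: leafWeight = " + leafWeight);
  } catch (EOFException e) {
    // Shows how far the read got before the stream ran out of bytes.
    System.err.println("Leaf.readFields: EOF after label = " + label);
    throw e;
  }
}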
Did you change anything else?  Maybe you could post the diffs.
Marty


On 02/28/2013 04:06 PM, Sara Del Río García wrote:
Hello all:

I'm testing the Random Forest partial implementation on Hadoop 2.0.0-cdh4.1.1.

I'm trying to modify the algorithm; all I do is add more information to the leaves of the tree. Currently each leaf contains only the label, and I want to add one more field:

@Override
public void readFields(DataInput in) throws IOException {
  label = in.readDouble();
  leafWeight = in.readDouble();
}

@Override
protected void writeNode(DataOutput out) throws IOException {
  out.writeDouble(label);
  out.writeDouble(leafWeight);
}
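To make the intent concrete, this is the kind of change I am aiming for (label2 is just a placeholder name for the extra field; the important part is that it is written and read in the same order):

@Override
public void readFields(DataInput in) throws IOException {
  label = in.readDouble();
  leafWeight = in.readDouble();
  label2 = in.readDouble(); // new field, read last because it is written last
}

@Override
protected void writeNode(DataOutput out) throws IOException {
  out.writeDouble(label);
  out.writeDouble(leafWeight);
  out.writeDouble(label2); // new field, written last
}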

And I get the following error:

13/02/27 06:53:27 INFO mapreduce.BuildForest: Partial Mapred implementation
13/02/27 06:53:27 INFO mapreduce.BuildForest: Building the forest...
13/02/27 06:53:27 INFO mapreduce.BuildForest: Weights Estimation: IR
13/02/27 06:53:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/27 06:53:39 INFO input.FileInputFormat: Total input paths to process : 1
13/02/27 06:53:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/27 06:53:39 WARN snappy.LoadSnappy: Snappy native library not loaded
13/02/27 06:53:39 INFO mapred.JobClient: Running job: job_201302270205_0013
13/02/27 06:53:40 INFO mapred.JobClient: map 0% reduce 0%
13/02/27 06:54:18 INFO mapred.JobClient: map 20% reduce 0%
13/02/27 06:54:42 INFO mapred.JobClient: map 40% reduce 0%
13/02/27 06:55:03 INFO mapred.JobClient: map 60% reduce 0%
13/02/27 06:55:26 INFO mapred.JobClient: map 70% reduce 0%
13/02/27 06:55:27 INFO mapred.JobClient: map 80% reduce 0%
13/02/27 06:55:49 INFO mapred.JobClient: map 100% reduce 0%
13/02/27 06:56:04 INFO mapred.JobClient: Job complete: job_201302270205_0013
13/02/27 06:56:04 INFO mapred.JobClient: Counters: 24
13/02/27 06:56:04 INFO mapred.JobClient: File System Counters
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes read=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes written=1828230
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of write operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes read=1381649
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes written=1680
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of read operations=30
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of write operations=10
13/02/27 06:56:04 INFO mapred.JobClient: Job Counters
13/02/27 06:56:04 INFO mapred.JobClient: Launched map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Data-local map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=254707
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Map-Reduce Framework
13/02/27 06:56:04 INFO mapred.JobClient: Map input records=20
13/02/27 06:56:04 INFO mapred.JobClient: Map output records=10
13/02/27 06:56:04 INFO mapred.JobClient: Input split bytes=1540
13/02/27 06:56:04 INFO mapred.JobClient: Spilled Records=0
13/02/27 06:56:04 INFO mapred.JobClient: CPU time spent (ms)=12070
13/02/27 06:56:04 INFO mapred.JobClient: Physical memory (bytes) snapshot=949579776
13/02/27 06:56:04 INFO mapred.JobClient: Virtual memory (bytes) snapshot=8412340224
13/02/27 06:56:04 INFO mapred.JobClient: Total committed heap usage (bytes)=478412800
Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:104)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:129)
at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:96)
at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:312)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:246)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:200)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:270)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at java.io.DataInputStream.readDouble(DataInputStream.java:451)
at org.apache.mahout.classifier.df.node.Leaf.readFields(Leaf.java:136)
at org.apache.mahout.classifier.df.node.Node.read(Node.java:85)
at org.apache.mahout.classifier.df.mapreduce.MapredOutput.readFields(MapredOutput.java:64)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2114)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2242)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
... 10 more

What's the problem?

Is it possible to write more information in the leaves of the tree?

Thank you very much.


Best regards,

Sara


