Hello, I set up a single-node pseudo-distributed system and left it running with a cron job that copies data from a remote server, loads it into Hadoop, and runs some calculations every hour.
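Roughly, the hourly job is a small shell script like the sketch below (the remote host and path are placeholders, but the Hive load statement is the same one that shows up in the log further down):

#!/bin/sh
# Hourly: pull the previous hour's log from the remote box and load it
# into a Hive table partitioned by date_hour (format YYYYMMDDHH).
HOUR=$(date -d '1 hour ago' +%Y%m%d%H)    # e.g. 2011021119
LOG=/var/mylog/hourly/log.CAT.$HOUR

# copy the raw log from the remote server (placeholder host/path)
scp logserver:/var/log/cat/log.CAT.$HOUR "$LOG"

# load it into its own partition; the hourly queries run against it afterwards
hive -e "load data local inpath '$LOG' into table cat_raw partition(date_hour=$HOUR);"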
It stopped working today with the error below. I deleted everything and made it reprocess from the beginning, and I still get the same error in the same place. Is there a limit on how many partitions a table can have? I spent a couple of hours trying to solve the problem, but now my Hive fun is over... Any ideas as to why this might be happening, or how I should go about debugging it?

Here is the relevant part of the Hive log:

11/02/12 01:27:47 INFO ql.Driver: Starting command: load data local inpath '/var/mylog/hourly/log.CAT.2011021119' into table cat_raw partition(date_hour=2011021119)
Copying data from file:/var/mylog/hourly/log.CAT.2011021119
11/02/12 01:27:47 INFO exec.CopyTask: Copying data from file:/var/mylog/hourly/log.CAT.2011021119 to hdfs://darkstar:9000/tmp/hive-cam/hive_2011-02-12_01-27-47_415_7165217842693560517/10000
11/02/12 01:27:47 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
11/02/12 01:27:47 INFO hdfs.DFSClient: Abandoning block blk_6275225343572661963_1859
11/02/12 01:27:53 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
11/02/12 01:27:53 INFO hdfs.DFSClient: Abandoning block blk_2673116090916206836_1859
11/02/12 01:27:59 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
11/02/12 01:27:59 INFO hdfs.DFSClient: Abandoning block blk_5414825878079983460_1859
11/02/12 01:28:05 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
11/02/12 01:28:05 INFO hdfs.DFSClient: Abandoning block blk_6043862611357349730_1859
11/02/12 01:28:11 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
11/02/12 01:28:11 WARN hdfs.DFSClient: Error Recovery for block blk_6043862611357349730_1859 bad datanode[0] nodes == null
11/02/12 01:28:11 WARN hdfs.DFSClient: Could not get block locations. Source file "/tmp/hive-cam/hive_2011-02-12_01-27-47_415_7165217842693560517/10000/log.CAT.2011021119" - Aborting...
Failed with exception null
11/02/12 01:28:11 ERROR exec.CopyTask: Failed with exception null
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2901)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

best regards,
-c.b.
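P.S. In case it helps anyone point me in the right direction, this is roughly what I plan to check next (standard Hadoop and Hive CLI commands):

# is the datanode up, and does the namenode report any capacity?
hadoop dfsadmin -report

# can I write to HDFS at all, outside of Hive?
hadoop fs -put /var/mylog/hourly/log.CAT.2011021119 /tmp/put-test

# is the disk behind dfs.data.dir simply full?
df -h

# how many partitions does the table actually have by now?
hive -e "show partitions cat_raw;" | wc -l

# anything suspicious in the datanode log around 01:27?
tail -n 200 $HADOOP_HOME/logs/hadoop-*-datanode-*.log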