Here's what's happening in the logs... I get these messages pretty often:

2009-02-21 10:47:27,252 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/in_table/compaction.dir/29712919/b2b/mapfiles/6353513045069085254/data retrying...
Sometimes I get these too:

2009-02-21 10:48:46,273 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 5 on 60020' on region in_table,,1235241411727: Memcache size 128.0m is >= than blocking 128.0m size

Here's what it logs when the job starts to fail:

2009-02-21 10:50:52,510 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/171.69.102.51:8270 remote=/171.69.102.51:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:162)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
        at java.io.BufferedOutputStream.write(Unknown Source)
        at java.io.DataOutputStream.write(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2209)
2009-02-21 10:50:52,511 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush of ~64.0m for region in_table,,1235241411727 in 5144ms, sequence id=30842181, compaction requested=true
2009-02-21 10:50:52,511 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region in_table,,1235241411727/29712919 because: regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher
2009-02-21 10:50:52,512 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2896903198415069285_18306 bad datanode[0] 171.69.102.51:50010
2009-02-21 10:50:52,513 FATAL org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed with ioe:
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,513 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,515 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,515 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,515 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=11, regions=2, stores=33, storefiles=167, storefileIndexSize=0, memcacheSize=1, usedHeap=156, maxHeap=963
2009-02-21 10:50:52,515 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
2009-02-21 10:50:52,516 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,516 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,516 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,516 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,516 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
2009-02-21 10:50:52,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020, call batchUpdates([...@cfdbc2, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@64c0d9) from 171.69.102.51:8468: error: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020, call batchUpdates([...@b0fc4d, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@184425c) from 171.69.102.51:8469: error: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,518 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020, call batchUpdates([...@20a9de, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@706d7c) from 171.69.102.52:9279: error: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:52,518 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call batchUpdates([...@1240afe, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@14dc2e6) from 171.69.102.52:9280: error: java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:53,049 DEBUG org.apache.hadoop.hbase.RegionHistorian: Offlined
2009-02-21 10:50:53,050 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
2009-02-21 10:50:53,050 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60020: exiting
2009-02-21 10:50:53,050 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60020
2009-02-21 10:50:53,051 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2009-02-21 10:50:53,051 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
2009-02-21 10:50:53,052 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60020: exiting
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020: exiting
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020: exiting
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60020: exiting
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60020: exiting
2009-02-21 10:50:53,052 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
2009-02-21 10:50:53,053 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60020: exiting
2009-02-21 10:50:53,053 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60020: exiting
2009-02-21 10:50:53,053 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
2009-02-21 10:50:54,503 INFO org.mortbay.http.SocketListener: Stopped SocketListener on 0.0.0.0:60030
2009-02-21 10:50:54,670 INFO org.mortbay.util.Container: Stopped HttpContext[/logs,/logs]
2009-02-21 10:50:54,671 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.webapplicationhand...@c3c315
2009-02-21 10:50:54,771 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/static,/static]
2009-02-21 10:50:54,772 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.webapplicationhand...@aae86e
2009-02-21 10:50:54,893 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/,/]
2009-02-21 10:50:54,893 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.ser...@1f3ce5c
2009-02-21 10:50:54,893 DEBUG org.apache.hadoop.hbase.regionserver.HLog: closing log writer in hdfs://rndpc0:9000/hbase/log_171.69.102.51_1235215460389_60020
2009-02-21 10:50:54,893 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to close log in abort
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)
2009-02-21 10:50:54,893 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: closing region in_table,,1235241411727
2009-02-21 10:50:54,893 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing in_table,,1235241411727: compactions & flushes disabled
2009-02-21 10:50:54,893 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: waiting for compaction to complete for region in_table,,1235241411727
2009-02-21 10:50:54,893 INFO org.apache.hadoop.hbase.regionserver.MemcacheFlusher: regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher exiting
2009-02-21 10:50:54,894 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/0:0:0:0:0:0:0:0:60020.logFlusher exiting
2009-02-21 10:50:54,894 INFO org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: regionserver/0:0:0:0:0:0:0:0:60020.majorCompactionChecker exiting
2009-02-21 10:50:57,251 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closing leases
2009-02-21 10:50:57,251 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closed leases
2009-02-21 10:51:01,199 DEBUG org.apache.hadoop.hbase.regionserver.HStore: moving /hbase/in_table/compaction.dir/29712919/tac_product_hw_key/mapfiles/6810647399799866363 to /hbase/in_table/29712919/tac_product_hw_key/mapfiles/4524637696729699312
2009-02-21 10:51:01,943 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2009-02-21 10:51:02,587 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 29712919/tac_product_hw_key store size is 17.0m
2009-02-21 10:51:02,828 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Compaction size of 29712919/summary: 23.9m; Skipped 1 file(s), size: 10737371
2009-02-21 10:51:03,896 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Started compaction of 7 file(s) into /hbase/in_table/compaction.dir/29712919/summary/mapfiles/3985892544092790173

and then followed by:

2009-02-21 10:50:54,503 INFO org.mortbay.http.SocketListener: Stopped SocketListener on 0.0.0.0:60030
2009-02-21 10:50:54,670 INFO org.mortbay.util.Container: Stopped HttpContext[/logs,/logs]
2009-02-21 10:50:54,671 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.webapplicationhand...@c3c315
2009-02-21 10:50:54,771 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/static,/static]
2009-02-21 10:50:54,772 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.webapplicationhand...@aae86e
2009-02-21 10:50:54,893 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/,/]
2009-02-21 10:50:54,893 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.ser...@1f3ce5c
2009-02-21 10:50:54,893 DEBUG org.apache.hadoop.hbase.regionserver.HLog: closing log writer in hdfs://rndpc0:9000/hbase/log_171.69.102.51_1235215460389_60020
2009-02-21 10:50:54,893 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to close log in abort
java.io.IOException: All datanodes 171.69.102.51:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2442)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:1997)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2160)

Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz


On Sat, Feb 21, 2009 at 3:17 AM, Amandeep Khurana <ama...@gmail.com> wrote:

> I changed the config and restarted the cluster. This time the job went up to 16% and the same problem started.
>
> I'll do the stuff with the logs now and see what comes out.
>
> Amandeep Khurana
> Computer Science Graduate Student
> University of California, Santa Cruz
>
>
> On Sat, Feb 21, 2009 at 3:08 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>
>> you have to change hadoop-site.xml and restart HDFS.
>>
>> you should also change the logging to be more verbose in hbase - check out the hbase FAQ (link missing -ed).
>>
>> if you get the problem again, peruse the hbase logs and post what is going on there. the client errors don't really include the root cause on the regionserver side.
>>
>> good luck,
>> -ryan
>>
>>
>> On Sat, Feb 21, 2009 at 2:21 AM, Amandeep Khurana <ama...@gmail.com> wrote:
>>
>>> I have 1 master + 2 slaves. I did set the timeout to zero. I'll set the xceivers to 2047 and try again. Can this be done in the job config or does the site.xml need to be changed and the cluster restarted?
>>>
>>> Amandeep
>>>
>>>
>>> Amandeep Khurana
>>> Computer Science Graduate Student
>>> University of California, Santa Cruz
>>>
>>>
>>> On Sat, Feb 21, 2009 at 2:16 AM, Ryan Rawson <ryano...@gmail.com> wrote:
>>>
>>>> So the usual suspects are:
>>>>
>>>> - xcievers (i have mine set to 2047)
>>>> - timeout (i have mine set to 0)
>>>>
>>>> I can import a few hundred million records with these settings.
>>>>
>>>> how many nodes do you have again?
>>>>
>>>>
>>>> On Sat, Feb 21, 2009 at 2:14 AM, Amandeep Khurana <ama...@gmail.com> wrote:
>>>>
>>>>> Yes, I noticed it this time. The regionserver gets slow or stops responding and then this error comes. How do I get this to work? Is there a way of limiting the resources that the map reduce job should take?
>>>>>
>>>>> I did make the changes in the site config similar to Larry Compton's config. It only made the job go from dying at 7% to 12% this time.
>>>>>
>>>>> Amandeep
>>>>>
>>>>>
>>>>> Amandeep Khurana
>>>>> Computer Science Graduate Student
>>>>> University of California, Santa Cruz
>>>>>
>>>>>
>>>>> On Sat, Feb 21, 2009 at 1:14 AM, stack <st...@duboce.net> wrote:
>>>>>
>>>>>> It looks like regionserver hosting root crashed:
>>>>>>
>>>>>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>
>>>>>> How many servers you running?
>>>>>>
>>>>>> You made similar config. to that reported by Larry Compton in a mail from earlier today? (See FAQ and Troubleshooting page for more on his listed configs.)
>>>>>>
>>>>>> St.Ack
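
For reference, the "xcievers" and "timeout" knobs Ryan refers to above are HDFS-side settings rather than job-level ones, which is why they go into hadoop-site.xml and only take effect after an HDFS restart. A rough sketch of the stanza, assuming the 0.19-era property names from the HBase FAQ/troubleshooting guidance rather than anything confirmed in this thread, so verify the names against your own cluster:

    <!-- hadoop-site.xml on the HDFS nodes; values are the ones Ryan quotes.
         Property names assumed, not taken from this thread. -->
    <property>
      <name>dfs.datanode.max.xcievers</name>  <!-- note the historical misspelling -->
      <value>2047</value>
    </property>
    <property>
      <!-- 0 disables the socket write timeout; the "5000 millis timeout while
           waiting for channel to be ready for write" in the log above appears
           to be this timeout firing -->
      <name>dfs.datanode.socket.write.timeout</name>
      <value>0</value>
    </property>
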
>>>>>>
>>>>>> On Sat, Feb 21, 2009 at 1:01 AM, Amandeep Khurana <ama...@gmail.com> wrote:
>>>>>>
>>>>>>> Yes, the table exists before I start the job.
>>>>>>>
>>>>>>> I am not using TableOutputFormat. I picked up the sample code from the docs and am using it.
>>>>>>>
>>>>>>> Here's the job conf:
>>>>>>>
>>>>>>>     JobConf conf = new JobConf(getConf(), IN_TABLE_IMPORT.class);
>>>>>>>     FileInputFormat.setInputPaths(conf, new Path("import_data"));
>>>>>>>     conf.setMapperClass(MapClass.class);
>>>>>>>     conf.setNumReduceTasks(0);
>>>>>>>     conf.setOutputFormat(NullOutputFormat.class);
>>>>>>>     JobClient.runJob(conf);
>>>>>>>
>>>>>>> Interestingly, the hbase shell isn't working now either. It's giving errors even when I give the command "list"...
>>>>>>>
>>>>>>> Amandeep Khurana
>>>>>>> Computer Science Graduate Student
>>>>>>> University of California, Santa Cruz
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Feb 21, 2009 at 12:10 AM, stack <st...@duboce.net> wrote:
>>>>>>>
>>>>>>>> The table exists before you start the MR job?
>>>>>>>>
>>>>>>>> When you say 'midway through the job', are you using tableoutputformat to insert into your table?
>>>>>>>>
>>>>>>>> Which version of hbase?
>>>>>>>>
>>>>>>>> St.Ack
>>>>>>>>
>>>>>>>> On Fri, Feb 20, 2009 at 9:55 PM, Amandeep Khurana <ama...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I don't know if this is related or not, but it seems to be. After this map reduce job, I tried to count the number of entries in the table in hbase through the shell. It failed with the following error:
>>>>>>>>>
>>>>>>>>> hbase(main):002:0> count 'in_table'
>>>>>>>>> NativeException: java.lang.NullPointerException: null
>>>>>>>>>     from java.lang.String:-1:in `<init>'
>>>>>>>>>     from org/apache/hadoop/hbase/util/Bytes.java:92:in `toString'
>>>>>>>>>     from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:50:in `getMessage'
>>>>>>>>>     from org/apache/hadoop/hbase/client/RetriesExhaustedException.java:40:in `<init>'
>>>>>>>>>     from org/apache/hadoop/hbase/client/HConnectionManager.java:841:in `getRegionServerWithRetries'
>>>>>>>>>     from org/apache/hadoop/hbase/client/MetaScanner.java:56:in `metaScan'
>>>>>>>>>     from org/apache/hadoop/hbase/client/MetaScanner.java:30:in `metaScan'
>>>>>>>>>     from org/apache/hadoop/hbase/client/HConnectionManager.java:411:in `getHTableDescriptor'
>>>>>>>>>     from org/apache/hadoop/hbase/client/HTable.java:219:in `getTableDescriptor'
>>>>>>>>>     from sun.reflect.NativeMethodAccessorImpl:-2:in `invoke0'
>>>>>>>>>     from sun.reflect.NativeMethodAccessorImpl:-1:in `invoke'
>>>>>>>>>     from sun.reflect.DelegatingMethodAccessorImpl:-1:in `invoke'
>>>>>>>>>     from java.lang.reflect.Method:-1:in `invoke'
>>>>>>>>>     from org/jruby/javasupport/JavaMethod.java:250:in `invokeWithExceptionHandling'
>>>>>>>>>     from org/jruby/javasupport/JavaMethod.java:219:in `invoke'
>>>>>>>>>     from org/jruby/javasupport/JavaClass.java:416:in `execute'
>>>>>>>>>     ... 145 levels...
>>>>>>>>>     from org/jruby/internal/runtime/methods/DynamicMethod.java:74:in `call'
>>>>>>>>>     from org/jruby/internal/runtime/methods/CompiledMethod.java:48:in `call'
>>>>>>>>>     from org/jruby/runtime/CallSite.java:123:in `cacheAndCall'
>>>>>>>>>     from org/jruby/runtime/CallSite.java:298:in `call'
>>>>>>>>>     from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:429:in `__file__'
>>>>>>>>>     from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in `__file__'
>>>>>>>>>     from ruby/hadoop/install/hbase_minus_0_dot_19_dot_0/bin//hadoop/install/hbase/bin/../bin/hirb.rb:-1:in `load'
>>>>>>>>>     from org/jruby/Ruby.java:512:in `runScript'
>>>>>>>>>     from org/jruby/Ruby.java:432:in `runNormally'
>>>>>>>>>     from org/jruby/Ruby.java:312:in `runFromMain'
>>>>>>>>>     from org/jruby/Main.java:144:in `run'
>>>>>>>>>     from org/jruby/Main.java:89:in `run'
>>>>>>>>>     from org/jruby/Main.java:80:in `main'
>>>>>>>>>     from /hadoop/install/hbase/bin/../bin/HBase.rb:444:in `count'
>>>>>>>>>     from /hadoop/install/hbase/bin/../bin/hirb.rb:348:in `count'
>>>>>>>>>     from (hbase):3:in `binding'
>>>>>>>>>
>>>>>>>>> Amandeep Khurana
>>>>>>>>> Computer Science Graduate Student
>>>>>>>>> University of California, Santa Cruz
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Feb 20, 2009 at 9:46 PM, Amandeep Khurana <ama...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Here's what it throws on the console:
>>>>>>>>>>
>>>>>>>>>> 09/02/20 21:45:29 INFO mapred.JobClient: Task Id : attempt_200902201300_0019_m_000006_0, Status : FAILED
>>>>>>>>>> java.io.IOException: table is null
>>>>>>>>>>         at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:33)
>>>>>>>>>>         at IN_TABLE_IMPORT$MapClass.map(IN_TABLE_IMPORT.java:1)
>>>>>>>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>>>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>>>>>>>>>>         at org.apache.hadoop.mapred.Child.main(Child.java:155)
>>>>>>>>>>
>>>>>>>>>> attempt_200902201300_0019_m_000006_0: org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:768)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:448)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:457)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:430)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:557)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:461)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:423)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:114)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:97)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at IN_TABLE_IMPORT$MapClass.configure(IN_TABLE_IMPORT.java:120)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
>>>>>>>>>> attempt_200902201300_0019_m_000006_0:         at org.apache.hadoop.mapred.Child.main(Child.java:155)
>>>>>>>>>>
>>>>>>>>>> Amandeep Khurana
>>>>>>>>>> Computer Science Graduate Student
>>>>>>>>>> University of California, Santa Cruz
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 20, 2009 at 9:43 PM, Amandeep Khurana <ama...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I am trying to import data from a flat file into Hbase using a Map Reduce job. There are close to 2 million rows. Midway into the job, it starts giving me connection problems and eventually kills the job. When the error comes, the hbase shell also stops working.
>>>>>>>>>>>
>>>>>>>>>>> This is what I get:
>>>>>>>>>>>
>>>>>>>>>>> 2009-02-20 21:37:14,407 INFO org.apache.hadoop.ipc.HBaseClass: Retrying connect to server: /171.69.102.52:60020. Already tried 0 time(s).
>>>>>>>>>>>
>>>>>>>>>>> What could be going wrong?
>>>>>>>>>>>
>>>>>>>>>>> Amandeep
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Amandeep Khurana
>>>>>>>>>>> Computer Science Graduate Student
>>>>>>>>>>> University of California, Santa Cruz
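
A closing note on the "table is null" error in the task output above: together with the IN_TABLE_IMPORT$MapClass.configure frame in the stderr trace, it is consistent with a mapper that opens its HTable in configure(), logs the connection failure, and then finds the field still null in map(). The sketch below reconstructs that pattern against the old 0.19 mapred API purely for illustration; the class layout, column name, and input parsing are guesses, not Amandeep's actual code.

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Hypothetical reconstruction of the failing mapper (the real one is an
    // inner class of IN_TABLE_IMPORT); shown only to illustrate the failure mode.
    public class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, NullWritable, NullWritable> {

      private HTable table;  // stays null if configure() cannot reach the cluster

      @Override
      public void configure(JobConf job) {
        try {
          // The HTable constructor locates ROOT/META; when the regionserver
          // carrying ROOT is down this throws NoServerForRegionException,
          // which is what the attempt_* stderr above shows from configure().
          table = new HTable(new HBaseConfiguration(job), "in_table");
        } catch (IOException e) {
          // Swallowing the failure here is what later surfaces in map() as
          // "java.io.IOException: table is null".
          e.printStackTrace();
        }
      }

      public void map(LongWritable key, Text line,
          OutputCollector<NullWritable, NullWritable> output, Reporter reporter)
          throws IOException {
        if (table == null) {
          throw new IOException("table is null");
        }
        // Assumed input layout: tab-separated row key and value.
        String[] fields = line.toString().split("\t");
        BatchUpdate update = new BatchUpdate(Bytes.toBytes(fields[0]));
        update.put("data:value", Bytes.toBytes(fields[1]));  // hypothetical column
        table.commit(update);  // 0.19-era write path
      }
    }

Using TableOutputFormat instead would hand the table handle to the framework, but either path fails the same way here: the regionserver holding ROOT aborted on the HDFS "All datanodes are bad" error shown at the top of this message, so no client can locate the root region until it is back.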