What is slow? Have you read http://hbase.apache.org/book/performance.html?
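For reference, here is the write pattern I'd suggest for the mapper below: disable autoflush so Puts are buffered client-side, and flush once in close(). This is a minimal sketch against the 0.90/CDH3 client API; the table name, buffer size, and class names are illustrative, not from your job.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Illustrative mapper skeleton: buffer Puts client-side, flush once per task.
public class ImportMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, LongWritable, Text> {

    private HTable hTable;

    @Override
    public void configure(JobConf job) {
        try {
            hTable = new HTable(HBaseConfiguration.create(), "my_table"); // table name is an assumption
            hTable.setAutoFlush(false);                  // buffer Puts instead of one RPC per Put
            hTable.setWriteBufferSize(12 * 1024 * 1024); // e.g. a 12MB write buffer
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public void map(LongWritable key, Text value,
            OutputCollector<LongWritable, Text> output, Reporter reporter)
            throws IOException {
        // ... build the Put exactly as in your map() below, then:
        // hTable.put(put);  // buffered; no setAutoFlush(true), no flushCommits() here
    }

    @Override
    public void close() throws IOException {
        hTable.flushCommits(); // send whatever is still buffered
        hTable.close();
    }
}
```

With autoflush on (the default), every hTable.put() is a round trip to a regionserver; buffered, the client ships batches when the write buffer fills.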
The default splitter makes as many tasks as there are regions in your table. If you want more, then split your table.

Why do you have a max filesize of 1G? What brought that on?

This line makes it so you flush to hbase on each Put:

> hTable.flushCommits();

Try disabling it and let hbase manage flushing (add a flushCommits to the map close() method).

You should consider the bulk loader: http://hbase.apache.org/bulk-loads.html

St.Ack

On Wed, Jun 1, 2011 at 9:24 AM, byambajargal <[email protected]> wrote:
> Hello everybody
>
> I run a cluster with 11 nodes of HBase CDH3u0 and I have 3 zookeeper
> servers in my cluster. It seems very slow when I run the job that imports a
> text file into an hbase table. My question is: what is the recommended
> configuration in hbase to write big data (around 17GB) into an HBase table?
> When I run the job it launches only 20 map tasks; it could be 100 or more.
> I have attached the hbase-site.xml file; if someone knows, please help me.
>
> Here is the map function of my job:
>
>     public void map(LongWritable key, Text value,
>             OutputCollector<TextPair, Text> output, Reporter reporter)
>             throws IOException {
>
>         String line = value.toString();
>
>         if (line != null && !line.isEmpty()) {
>             String[] items = line.split("\\,");
>
>             String concept_id = items[1];
>             String element_id = items[0];
>             Put put = new Put(Bytes.toBytes(concept_id));
>
>             // keys of ELEMENT_* column families are element ids
>             put.add(Constant.COLUMN_ELEMENT_ID,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[0]));
>             put.add(Constant.COLUMN_ELEMENT_CONTEXT_ID,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[2]));
>             put.add(Constant.COLUMN_ELEMENT_POSITION_FORM,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[3]));
>             put.add(Constant.COLUMN_ELEMENT_POSITION_TO,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[4]));
>             put.add(Constant.COLUMN_ELEMENT_TERM_ID,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[5]));
>             put.add(Constant.COLUMN_ELEMENT_DICTIONARY_ID,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[6]));
>             put.add(Constant.COLUMN_ELEMENT_WORKFLOW_STATUS,
>                     Bytes.toBytes(element_id), Bytes.toBytes(items[7]));
>
>             hTable.put(put);
>             hTable.setAutoFlush(true);
>             hTable.flushCommits();
>             //output.collect(new TextPair(items[1], "1"),
>             //        new Text(items[0] + items[1]));
>         }
>     }
>
> Here is the configuration file of hbase:
>
>     <property>
>         <name>hbase.zookeeper.property.maxClientCnxns</name>
>         <value>1000</value>
>     </property>
>     <property>
>         <name>hbase.hregion.max.filesize</name>
>         <value>1073741824</value>
>     </property>
>     <property>
>         <name>hbase.regionserver.handler.count</name>
>         <value>200</value>
>     </property>
>     <property>
>         <name>dfs.datanode.max.xcievers</name>
>         <value>4096</value>
>     </property>
>     <property>
>         <name>hfile.block.cache.size</name>
>         <value>0.4</value>
>     </property>
>     <property>
>         <name>hbase.client.scanner.caching</name>
>         <value>100000</value>
>     </property>
>     <property>
>         <name>hbase.zookeeper.quorum</name>
>         <value>server1,serve3,server5</value>
>     </property>
>
> cheers
>
> Byambajargal
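Following up on the bulk loader pointer above: for a one-shot 17GB import, writing HFiles directly and then handing them to the cluster usually beats online Puts entirely. A minimal driver sketch, assuming a hypothetical new-API mapper (`PutMapper`) that emits (ImmutableBytesWritable, Put) pairs; the table name and paths are illustrative:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "hfile-import");
        job.setJarByClass(BulkLoadDriver.class);
        job.setMapperClass(PutMapper.class); // hypothetical mapper emitting (rowkey, Put)
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input text files
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HFile output dir
        // Wires in the partitioner, reducer, and output format so the job
        // produces one set of HFiles per existing region of the table.
        HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, "my_table"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

After the job finishes, the generated HFiles are moved into the table with the completebulkload tool described at the bulk-loads link above. Note this path depends on the table being pre-split, since configureIncrementalLoad makes one reducer per region.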
