Hi All,

I am using HBase 1.1.1. I came across a post describing hbase-spark being included in HBase core.
I am trying to use HBaseContext but can't find the appropriate lib. While trying to add the following to my pom I am getting a missing artifact error:

<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>1.1.1</version>

-----Original Message-----
From: "Ted Yu" <yuzhih...@gmail.com>
Sent: 20-10-2015 21:30
To: "Amit Hora" <hora.a...@gmail.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Spark opening to many connection with zookeeper

I need to dig deeper into saveAsHadoopDataset to see what might have caused the effect you observed.

Cheers

On Tue, Oct 20, 2015 at 8:57 AM, Amit Hora <hora.a...@gmail.com> wrote:

Hi Ted,

I made a mistake last time. Yes, the connections were very controlled when I used put: I iterated over the RDD and, within each partition, made a connection and executed a put list against HBase.

But why were there so many connections when I used hconf and the saveAsHadoopDataset method?

From: Amit Hora
Sent: 20-10-2015 20:38
To: Ted Yu
Cc: user
Subject: RE: Spark opening to many connection with zookeeper

I used that also, but the number of connections kept increasing: it started from 10 and went up to 299.
Then I changed my zookeeper conf to set the max client connections to just 30 and restarted the job.
Now the connections have stayed between 18 and 24 for the last 2 hours. I am unable to understand such behaviour.

From: Ted Yu
Sent: 20-10-2015 20:19
To: Amit Hora
Cc: user
Subject: Re: Spark opening to many connection with zookeeper

Can you take a look at example 37 on page 225 of:
http://hbase.apache.org/apache_hbase_reference_guide.pdf

You can use the following method of Table:

  void put(List<Put> puts) throws IOException;

After the put() returns, the connection is closed.

Cheers

On Tue, Oct 20, 2015 at 2:40 AM, Amit Hora <hora.a...@gmail.com> wrote:

One region

From: Ted Yu
Sent: 20-10-2015 15:01
To: Amit Singh Hora
Cc: user
Subject: Re: Spark opening to many connection with zookeeper

How many regions does your table have? Which HBase release do you use?

Cheers

On Tue, Oct 20, 2015 at 12:32 AM, Amit Singh Hora <hora.a...@gmail.com> wrote:

Hi All,

My Spark job started reporting zookeeper errors. After looking at the zkdumps from the HBase master, I realized that a large number of connections are being made from the nodes where the Spark workers are running. I believe the connections are somehow not getting closed, and that is leading to the error.

Please find the code below:

val conf = ConfigFactory.load("connection.conf").getConfig("connection")
val hconf = HBaseConfiguration.create()
hconf.set(TableOutputFormat.OUTPUT_TABLE, conf.getString("hbase.tablename"))
hconf.set("zookeeper.session.timeout", conf.getString("hbase.zookeepertimeout"))
hconf.set("hbase.client.retries.number", Integer.toString(1))
hconf.set("zookeeper.recovery.retry", Integer.toString(1))
hconf.set("hbase.master", conf.getString("hbase.hbase_master"))
hconf.set("hbase.zookeeper.quorum", conf.getString("hbase.hbase_zkquorum")) // zkquorum consists of 5 nodes
hconf.set("zookeeper.znode.parent", "/hbase-unsecure")
hconf.set("hbase.zookeeper.property.clientPort", conf.getString("hbase.hbase_zk_port"))
hconf.set(TableOutputFormat.OUTPUT_TABLE, conf.getString("hbase.tablename"))

val jobConfig: JobConf = new JobConf(hconf, this.getClass)
jobConfig.set("mapreduce.output.fileoutputformat.outputdir", "/user/user01/out")
jobConfig.setOutputFormat(classOf[TableOutputFormat])
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, conf.getString("hbase.tablename"))

try {
  rdd.map(convertToPut).
    saveAsHadoopDataset(jobConfig)
}

The method convertToPut does nothing but convert the JSON to HBase Put objects.

After I killed the application/driver, the number of connections decreased drastically.

Kindly help in understanding and resolving the issue.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-opening-to-many-connection-with-zookeeper-tp25137.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
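
For reference, a minimal sketch of the per-partition batched-put approach described earlier in the thread, written against the HBase 1.x client API. The helper name savePartitionwise, the toPut conversion parameter, and the tableName argument are illustrative placeholders, not code from the original job:

import java.util.{ArrayList => JArrayList}

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.spark.rdd.RDD

// One HBase connection per partition, closed before the task finishes, so the
// number of ZooKeeper connections stays bounded by the number of running tasks.
def savePartitionwise[T](rdd: RDD[T], tableName: String)(toPut: T => Put): Unit = {
  rdd.foreachPartition { records =>
    val hconf = HBaseConfiguration.create() // set quorum, znode parent, client port, etc. as in hconf above
    val connection = ConnectionFactory.createConnection(hconf)
    val table = connection.getTable(TableName.valueOf(tableName))
    try {
      val puts = new JArrayList[Put]()
      records.foreach(r => puts.add(toPut(r)))
      if (!puts.isEmpty) table.put(puts) // Table.put(List<Put>), as suggested above
    } finally {
      table.close()
      connection.close() // releases the underlying ZooKeeper connection
    }
  }
}

The point of this sketch is that the connection lifecycle is explicit and the connection is closed when the partition is done, whereas the saveAsHadoopDataset path delegates connection handling to TableOutputFormat, which is what Ted suggested digging into above.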