I need to dig deeper into saveAsHadoopDataset to see what might have caused the effect you observed.
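For comparison, the per-partition approach you describe could look roughly like the sketch below. It assumes the HBase 1.x client API (ConnectionFactory, Table), which may not match your release. writePuts, zkQuorum and batchSize are names made up for illustration and do not come from your job.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.spark.rdd.RDD
import scala.collection.JavaConverters._

// Hypothetical helper: writes an RDD of Puts using one HBase connection per partition.
def writePuts(puts: RDD[Put], tableName: String, zkQuorum: String, batchSize: Int = 1000): Unit = {
  puts.foreachPartition { partition =>
    // Build the configuration on the executor, since Hadoop's Configuration is not serializable.
    val hconf = HBaseConfiguration.create()
    hconf.set("hbase.zookeeper.quorum", zkQuorum)

    // One connection (and hence one ZooKeeper session) per partition, closed explicitly.
    val connection = ConnectionFactory.createConnection(hconf)
    try {
      val table = connection.getTable(TableName.valueOf(tableName))
      try {
        // Send the Puts in batches through Table.put(List<Put>).
        partition.grouped(batchSize).foreach(batch => table.put(batch.asJava))
      } finally {
        table.close()
      }
    } finally {
      connection.close()
    }
  }
}
```

With this shape the number of open connections is bounded by the number of partitions being processed at the same time, since each one is closed in the finally block.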
Cheers

On Tue, Oct 20, 2015 at 8:57 AM, Amit Hora <[email protected]> wrote:

> Hi Ted,
>
> I made a mistake last time. Yes, the connections are well controlled when I
> used put: I iterated over the RDD with foreach and, within that, for each
> partition opened a connection and executed a list of puts against HBase.
>
> But why were there so many connections when I used
> hconf and the saveAsHadoopDataset method?
> ------------------------------
> From: Amit Hora <[email protected]>
> Sent: 20-10-2015 20:38
> To: Ted Yu <[email protected]>
> Cc: user <[email protected]>
> Subject: RE: Spark opening to many connection with zookeeper
>
> I used that also, but the number of connections kept increasing: it started
> from 10 and went up to 299.
> Then I changed my ZooKeeper conf to set max client connections to just 30
> and restarted the job.
> Now the connections have stayed between 18 and 24 for the last 2 hours.
>
> I am unable to understand this behaviour.
> ------------------------------
> From: Ted Yu <[email protected]>
> Sent: 20-10-2015 20:19
> To: Amit Hora <[email protected]>
> Cc: user <[email protected]>
> Subject: Re: Spark opening to many connection with zookeeper
>
> Can you take a look at example 37 on page 225 of:
> http://hbase.apache.org/apache_hbase_reference_guide.pdf
>
> You can use the following method of Table:
>
> void put(List<Put> puts) throws IOException;
>
> After the put() returns, the connection is closed.
>
> Cheers
>
> On Tue, Oct 20, 2015 at 2:40 AM, Amit Hora <[email protected]> wrote:
>
>> One region
>> ------------------------------
>> From: Ted Yu <[email protected]>
>> Sent: 20-10-2015 15:01
>> To: Amit Singh Hora <[email protected]>
>> Cc: user <[email protected]>
>> Subject: Re: Spark opening to many connection with zookeeper
>>
>> How many regions does your table have?
>>
>> Which HBase release do you use?
>>
>> Cheers
>>
>> On Tue, Oct 20, 2015 at 12:32 AM, Amit Singh Hora <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> My Spark job started reporting ZooKeeper errors. After looking at the zkdumps
>>> from the HBase master, I realized that N connections are being
>>> made from the nodes where the Spark workers are running. I believe that somehow
>>> the connections are not getting closed, which is leading to the error.
>>>
>>> Please find the code below:
>>>
>>> val conf = ConfigFactory.load("connection.conf").getConfig("connection")
>>> val hconf = HBaseConfiguration.create();
>>> hconf.set(TableOutputFormat.OUTPUT_TABLE,
>>>   conf.getString("hbase.tablename"))
>>> hconf.set("zookeeper.session.timeout",
>>>   conf.getString("hbase.zookeepertimeout"));
>>> hconf.set("hbase.client.retries.number", Integer.toString(1));
>>> hconf.set("zookeeper.recovery.retry", Integer.toString(1));
>>> hconf.set("hbase.master", conf.getString("hbase.hbase_master"));
>>>
>>> hconf.set("hbase.zookeeper.quorum",conf.getString("hbase.hbase_zkquorum"));
>>> // zkquorum consists of 5 nodes
>>> hconf.set("zookeeper.znode.parent", "/hbase-unsecure");
>>> hconf.set("hbase.zookeeper.property.clientPort",
>>>   conf.getString("hbase.hbase_zk_port"));
>>>
>>> hconf.set(TableOutputFormat.OUTPUT_TABLE,conf.getString("hbase.tablename"))
>>> val jobConfig: JobConf = new JobConf(hconf, this.getClass)
>>> jobConfig.set("mapreduce.output.fileoutputformat.outputdir",
>>>   "/user/user01/out")
>>> jobConfig.setOutputFormat(classOf[TableOutputFormat])
>>> jobConfig.set(TableOutputFormat.OUTPUT_TABLE,
>>>   conf.getString("hbase.tablename"))
>>>
>>> try{
>>>   rdd.map(convertToPut).
>>>     saveAsHadoopDataset(jobConfig)
>>> }
>>>
>>> The method convertToPut does nothing but convert the JSON to HBase Put
>>> objects.
>>>
>>> After I killed the application/driver, the number of connections decreased
>>> drastically.
>>>
>>> Kindly help in understanding and resolving the issue.
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-opening-to-many-connection-with-zookeeper-tp25137.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
