Hi all,
My Spark job started reporting ZooKeeper errors. After looking at the zkdumps
from the HBase master, I realized that a large number of connections were being
made from the nodes where the Spark workers are running. I believe the
connections are somehow not getting closed, and that is leading to the errors.
Please find the code below:
import com.typesafe.config.ConfigFactory
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.mapred.JobConf

val conf = ConfigFactory.load("connection.conf").getConfig("connection")

val hconf = HBaseConfiguration.create()
hconf.set(TableOutputFormat.OUTPUT_TABLE, conf.getString("hbase.tablename"))
hconf.set("zookeeper.session.timeout", conf.getString("hbase.zookeepertimeout"))
hconf.set("hbase.client.retries.number", Integer.toString(1))
hconf.set("zookeeper.recovery.retry", Integer.toString(1))
hconf.set("hbase.master", conf.getString("hbase.hbase_master"))
// The quorum consists of 5 nodes
hconf.set("hbase.zookeeper.quorum", conf.getString("hbase.hbase_zkquorum"))
hconf.set("zookeeper.znode.parent", "/hbase-unsecure")
hconf.set("hbase.zookeeper.property.clientPort", conf.getString("hbase.hbase_zk_port"))

val jobConfig: JobConf = new JobConf(hconf, this.getClass)
jobConfig.set("mapreduce.output.fileoutputformat.outputdir", "/user/user01/out")
jobConfig.setOutputFormat(classOf[TableOutputFormat])
jobConfig.set(TableOutputFormat.OUTPUT_TABLE, conf.getString("hbase.tablename"))

// Write the RDD to HBase through the old mapred API
rdd.map(convertToPut).saveAsHadoopDataset(jobConfig)
The method convertToPut does nothing but convert the JSON to HBase Put
objects.
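Roughly, it looks like the sketch below. The JSON handling, row key, and
column names here are placeholders, not my real code; the point is only that
saveAsHadoopDataset with TableOutputFormat expects (ImmutableBytesWritable,
Put) pairs:

import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.util.Bytes

// Sketch only: the real JSON parsing and column layout differ.
def convertToPut(json: String): (ImmutableBytesWritable, Put) = {
  val rowKey = Bytes.toBytes(json.hashCode.toString) // placeholder row key
  val put = new Put(rowKey)
  // Store the raw JSON in one column; the real version parses fields out.
  put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("payload"), Bytes.toBytes(json))
  (new ImmutableBytesWritable(rowKey), put)
}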
After I killed the application/driver, the number of connections decreased
drastically.

Kindly help me understand and resolve this issue.