Hi

I'm accessing multiple regions (~5k) of an HBase table using Spark's
newAPIHadoopRDD, but the driver tries to calculate the region size of
all the regions.
It does not even reuse the hconnection; it creates a new connection for
every request (see below), which takes a lot of time.

Is there a better approach to do this?


[18 Nov 2016 22:25:22,759] [INFO Driver] RecoverableZooKeeper: Process
identifier=*hconnection-0x1e7824af* connecting to ZooKeeper ensemble=
hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181
[18 Nov 2016 22:25:22,759] [INFO Driver] ZooKeeper: Initiating client
connection, connectString=hbase19.cloud.com:2181,hbase24.cloud.com:2181,
hbase28.cloud.com:2181 sessionTimeout=60000
watcher=hconnection-0x1e7824af0x0, quorum=hbase19.cloud.com:2181,
hbase24.cloud.com:2181,hbase28.cloud.com:2181, baseZNode=/hbase
[18 Nov 2016 22:25:22,761] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Opening socket connection to server
hbase24.cloud.com/10.193.150.217:2181. Will not attempt to authenticate
using SASL (unknown error)
[18 Nov 2016 22:25:22,763] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Socket connection established, initiating session, client: /
10.193.138.145:47891, server: hbase24.cloud.com/10.193.150.217:2181
[18 Nov 2016 22:25:22,766] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Session establishment complete on server
hbase24.cloud.com/10.193.150.217:2181, sessionid = 0x2564f6f013e0e95,
negotiated timeout = 60000
[18 Nov 2016 22:25:22,766] [INFO Driver] RegionSizeCalculator: Calculating
region sizes for table "message".
[18 Nov 2016 22:25:27,867] [INFO Driver]
ConnectionManager$HConnectionImplementation: Closing master protocol:
MasterService
[18 Nov 2016 22:25:27,868] [INFO Driver]
ConnectionManager$HConnectionImplementation: Closing zookeeper
sessionid=0x2564f6f013e0e95
[18 Nov 2016 22:25:27,869] [INFO Driver] ZooKeeper: Session:
0x2564f6f013e0e95 closed
[18 Nov 2016 22:25:27,869] [INFO Driver-EventThread] ClientCnxn:
EventThread shut down
[18 Nov 2016 22:25:27,880] [INFO Driver] RecoverableZooKeeper: Process
identifier=*hconnection-0x6a8a1efa* connecting to ZooKeeper ensemble=
hbase19.cloud.com:2181,hbase24.cloud.com:2181,hbase28.cloud.com:2181
[18 Nov 2016 22:25:27,880] [INFO Driver] ZooKeeper: Initiating client
connection, connectString=hbase19.cloud.com:2181,hbase24.cloud.com:2181,
hbase28.cloud.com:2181 sessionTimeout=60000
watcher=hconnection-0x6a8a1efa0x0, quorum=hbase19.cloud.com:2181,
hbase24.cloud.com:2181,hbase28.cloud.com:2181, baseZNode=/hbase
[18 Nov 2016 22:25:27,883] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Opening socket connection to server
hbase24.cloud.com/10.193.150.217:2181. Will not attempt to authenticate
using SASL (unknown error)
[18 Nov 2016 22:25:27,885] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Socket connection established, initiating session, client: /
10.193.138.145:47894, server: hbase24.cloud.com/10.193.150.217:2181
[18 Nov 2016 22:25:27,887] [INFO Driver-SendThread(hbase24.cloud.com:2181)]
ClientCnxn: Session establishment complete on server
hbase24.cloud.com/10.193.150.217:2181, sessionid = 0x2564f6f013e0e97,
negotiated timeout = 60000
[18 Nov 2016 22:25:27,888] [INFO Driver] RegionSizeCalculator: Calculating
region sizes for table "message".
....

-- 
Thanks & Regards,

*Mukesh Jha <me.mukesh....@gmail.com>*
