Re: Getting all the columns of row-key at once

2016-01-22 Thread Anoop John
This second loop will work for you, right? You want simpler code, and you know all the column names (cf:qual)? Result has a method getValue(byte[] family, byte[] qualifier) which returns the value of the latest-version cell for the given cf:qual. As you always have only one version, it…
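A minimal sketch of the approach Anoop describes, assuming an hbase-client 0.98.x dependency on the classpath; the table handle, row key, and the "cf"/qualifier names are placeholders, not anything from the original thread:

```java
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class GetKnownColumns {
    static final byte[] CF = Bytes.toBytes("cf");  // placeholder family name

    // Fetch the latest value of each known qualifier from one row with a
    // single Get, instead of looping over the raw Cell list.
    static byte[][] fetch(HTableInterface table, byte[] rowKey,
                          String... qualifiers) throws java.io.IOException {
        Result result = table.get(new Get(rowKey));
        byte[][] values = new byte[qualifiers.length][];
        for (int i = 0; i < qualifiers.length; i++) {
            // getValue returns the latest-version cell value for cf:qual,
            // or null if that column is absent in this row.
            values[i] = result.getValue(CF, Bytes.toBytes(qualifiers[i]));
        }
        return values;
    }
}
```

Since the table holds only one version per cell, the "latest version" returned by getValue is simply the stored value.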

HBase 0.98.0 with Spark 1.5.3

2016-01-22 Thread Ajinkya Kale
I am using this code example http://www.vidyasource.com/blog/Programming/Scala/Java/Data/Hadoop/Analytics/2014/01/25/lighting-a-spark-with-hbase to read an HBase table using Spark, with the only change being that I set hbase.zookeeper.quorum in code, as it is not being picked up from hbase-site.xml.
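A sketch of that setup (setting the quorum programmatically and reading the table via TableInputFormat), assuming Spark 1.x and the HBase 0.98 client are on the classpath; the quorum hosts and table name are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ReadHBaseTable {
    public static void main(String[] args) {
        JavaSparkContext sc =
            new JavaSparkContext(new SparkConf().setAppName("hbase-read"));

        Configuration conf = HBaseConfiguration.create();
        // Set the quorum in code when hbase-site.xml is not on the classpath.
        conf.set("hbase.zookeeper.quorum", "zk1,zk2,zk3");
        conf.set(TableInputFormat.INPUT_TABLE, "mytable");

        // Each record is (row key, Result) for one HBase row.
        JavaPairRDD<ImmutableBytesWritable, Result> rdd =
            sc.newAPIHadoopRDD(conf, TableInputFormat.class,
                               ImmutableBytesWritable.class, Result.class);
        System.out.println("rows: " + rdd.count());
        sc.stop();
    }
}
```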

Re: HFile vs Parquet for very wide table

2016-01-22 Thread Jerry He
Parquet may be more efficient in your use case, coupled with an upper-layer query engine. But Parquet has a schema. The schema can evolve, though, e.g. by adding columns in new Parquet files. HBase would be able to do the job too, and it is schema-less -- you can add columns freely. Jerry On Fri, Jan 22, 2016…
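To illustrate the schema-less point: in HBase only column families are declared up front; a qualifier can be introduced on any individual Put, so "adding a column" requires no schema change. A sketch, with placeholder table, row, and column names:

```java
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AddColumnFreely {
    // Write a qualifier that was never declared anywhere: the cf:qual pair
    // comes into existence with this Put, for this row only.
    static void writeNewColumn(HTableInterface table) throws java.io.IOException {
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("cf"),
                Bytes.toBytes("brand_new_col"),   // previously unseen qualifier
                Bytes.toBytes("some value"));
        table.put(put);
    }
}
```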

Re: data write/read consistency issue

2016-01-22 Thread Stack
On Fri, Jan 22, 2016 at 1:51 AM, Rural Hunter wrote: > Hi, > > I have an HBase cluster with 7 servers at version 0.98.13-hadoop2, > dfs.replication=2. > In a write session, we update some data. Then in a new read session > immediately, we read the data using the Get class and found it sometimes > returns…

Re: HFile vs Parquet for very wide table

2016-01-22 Thread Krishna
Thanks Ted, Jerry. Computing pairwise similarity is the primary purpose of the matrix. This is done by extracting all rows for a set of columns at each iteration. On Thursday, January 21, 2016, Jerry He wrote: > What do you want to do with your matrix data? How do you want to use it? > Do you…

data write/read consistency issue

2016-01-22 Thread Rural Hunter
Hi, I have an HBase cluster with 7 servers at version 0.98.13-hadoop2, dfs.replication=2. In a write session, we update some data. Then in a new read session immediately, we read the data using the Get class and found it sometimes returns the old version of the data (before the update). We have to add a…