>>So is it possible to specify FASTDIFF for rowkey/column and DIFF for value cell?
No, that is not possible now. Encoding is applied per KeyValue (KV) only. But what you suggest is definitely worth trying.

>>So would you recommend storing JSON flattened as many columns?
Maybe, yes. But I have not used JSON formats in practice, so I may not be the best person to comment on this.

Regards
Ram

On Thu, Nov 13, 2014 at 2:01 PM, Jianshi Huang <[email protected]> wrote:

> Thanks Ram,
>
> So is it possible to specify FASTDIFF for rowkey/column and DIFF for value cell?
>
> So would you recommend storing JSON flattened as many columns?
>
> Jianshi
>
> On Thu, Nov 13, 2014 at 2:08 PM, ramkrishna vasudevan <[email protected]> wrote:
>
> > Hi
> >
> > >> Since I'm storing historical data (snapshot data) and changes between
> > >> adjacent value cells are relatively small.
> >
> > If the values change at all, even slightly, FASTDIFF will rewrite the
> > value part. Only on an exact match does it skip the value part. JFYI.
> >
> > Regards
> > Ram
> >
> > On Thu, Nov 13, 2014 at 11:23 AM, Jianshi Huang <[email protected]> wrote:
> >
> > > I thought FASTDIFF was only for rowkey and columns; great if it also
> > > works in the value cell.
> > >
> > > And thanks for the bjson link!
> > >
> > > Jianshi
> > >
> > > On Thu, Nov 13, 2014 at 1:18 PM, Ted Yu <[email protected]> wrote:
> > >
> > > > There is FASTDIFF data block encoding.
> > > >
> > > > See also http://bjson.org/
> > > >
> > > > Cheers
> > > >
> > > > On Nov 12, 2014, at 9:08 PM, Jianshi Huang <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm currently saving JSON in pure String format in the value cell
> > > > > and depend on HBase's block compression to reduce the overhead of
> > > > > JSON.
> > > > >
> > > > > I'm wondering if there's a more space-efficient way to store JSON?
> > > > > (There are lots of 0s and 1s; a JSON string is actually an OK format.)
> > > > >
> > > > > I want to keep the value as a Map since the schema of the source
> > > > > data might change over time.
> > > > >
> > > > > Also, is there a DIFF-based encoding for values? I'm storing
> > > > > historical data (snapshot data), and changes between adjacent value
> > > > > cells are relatively small.
> > > > >
> > > > > Thanks,
> > > > > --
> > > > > Jianshi Huang
> > > > >
> > > > > LinkedIn: jianshi
> > > > > Twitter: @jshuang
> > > > > Github & Blog: http://huangjs.github.com/
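
For reference, the "JSON flattened as many columns" idea discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name flatten_json, the "." separator, and the sample document are all hypothetical, and the code does not talk to HBase at all. It only computes the (qualifier, value) pairs that a client would then write as one column each. Whether this layout actually saves space with FASTDIFF or block compression would have to be measured on real data.

```python
import json


def flatten_json(obj, prefix="", sep="."):
    """Flatten nested dicts/lists into {qualifier: value_bytes} pairs.

    Hypothetical helper: each returned pair is meant to become one column.
    Caveats: empty dicts/lists produce no cells, and keys that already
    contain `sep` make the qualifiers ambiguous.
    """
    cells = {}
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        # List positions become path components: tags.0, tags.1, ...
        items = ((str(i), v) for i, v in enumerate(obj))
    else:
        # Leaf: keep the scalar as compact JSON text so its type
        # (number, string, bool, null) survives a round trip.
        cells[prefix] = json.dumps(obj).encode("utf-8")
        return cells
    for key, value in items:
        qualifier = f"{prefix}{sep}{key}" if prefix else str(key)
        cells.update(flatten_json(value, qualifier, sep))
    return cells


# Made-up snapshot document, not data from the thread.
snapshot = json.loads(
    '{"id": 7, "flags": {"active": 1, "deleted": 0}, "tags": ["a", "b"]}'
)
cells = flatten_json(snapshot)
for qualifier in sorted(cells):
    print(qualifier, cells[qualifier])
```

Because every leaf becomes its own column, a schema change in the source data only adds or removes qualifiers, which keeps the Map-like flexibility asked for in the original question.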
