You can use HBase from HDP 2.2 on HDFS 2.5. If you have further questions, let's take it offline.

Cheers

On Thu, Nov 13, 2014 at 6:12 PM, Jianshi Huang wrote:

> But HDP 2.2 uses HDFS 2.6.0... very hard to convince our admins to upgrade.
>
> Would you recommend us to upgrade to 2.6.0? I'll ask them to consult HWX if
> you say yes. :)
>
> Jianshi
>
> On Fri, Nov 14, 2014 at 9:42 AM, Ted Yu wrote:
>
> > No.
> > The upcoming HDP 2.2 does have that fix.
> >
> > Cheers
> >
> > On Thu, Nov 13, 2014 at 5:38 PM, Jianshi Huang wrote:
> >
> > > Oh, btw, is latest HDP 2.1(0.98.0.2.1.7.0-784-hadoop2) have this fix?
> > >
> > > Jianshi
> > >
> > > On Fri, Nov 14, 2014 at 9:37 AM, Jianshi Huang wrote:
> > >
> > > > Thanks Ted.
> > > >
> > > > I think the fix you mentioned is this one HBASE-12078
> > > > <https://issues.apache.org/jira/browse/HBASE-12078>.
> > > >
> > > > Not sure when our Hadoop admin would upgrade it, ahhh....
> > > >
> > > > Jianshi
> > > >
> > > > On Thu, Nov 13, 2014 at 11:15 PM, Ted Yu wrote:
> > > >
> > > > > Keep in mind that Prefix Tree encoding has higher overhead in write
> > > > > path compared to other data block encoding methods.
> > > > >
> > > > > Please use 0.98.7 which has the latest fixes for Prefix Tree encoding.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Thu, Nov 13, 2014 at 1:27 AM, Jianshi Huang wrote:
> > > > >
> > > > > > Thanks Ram,
> > > > > >
> > > > > > How about Prefix Tree based encoding then? HBASE-4676
> > > > > > <https://issues.apache.org/jira/browse/HBASE-4676> says it's also
> > > > > > possible to do suffix tries? Then it could be a nice fit for JSON
> > > > > > String (or any long value where changes are small).
> > > > > >
> > > > > > Maybe I should just flatten JSON to columns, hmm...what's the
> > > > > > overhead for a column?
> > > > > >
> > > > > > Jianshi
> > > > > >
> > > > > > On Thu, Nov 13, 2014 at 4:49 PM, ramkrishna vasudevan wrote:
> > > > > >
> > > > > > > >>So is it possible to specify FASTDIFF for rowkey/column and
> > > > > > > >>DIFF for value cell?
> > > > > > > No that is not possible now. All the encoding is per KV only.
> > > > > > > But what you say is definitely worth trying.
> > > > > > >
> > > > > > > >>So would you recommend storing JSON flattened as many columns?
> > > > > > > May be yes. But I have practically not used JSON formats so I
> > > > > > > may not be the best person to comment on this.
> > > > > > >
> > > > > > > Regards
> > > > > > > Ram
> > > > > > >
> > > > > > > On Thu, Nov 13, 2014 at 2:01 PM, Jianshi Huang wrote:
> > > > > > >
> > > > > > > > Thanks Ram,
> > > > > > > >
> > > > > > > > So is it possible to specify FASTDIFF for rowkey/column and
> > > > > > > > DIFF for value cell?
> > > > > > > >
> > > > > > > > So would you recommend storing JSON flattened as many columns?
> > > > > > > >
> > > > > > > > Jianshi
> > > > > > > >
> > > > > > > > On Thu, Nov 13, 2014 at 2:08 PM, ramkrishna vasudevan wrote:
> > > > > > > >
> > > > > > > > > Hi
> > > > > > > > >
> > > > > > > > > >> Since I'm storing historical data (snapshot data) and
> > > > > > > > > >> changes between adjacent value cells are relatively small.
> > > > > > > > >
> > > > > > > > > If the values are changing even if it is smaller the FASTDIFF
> > > > > > > > > will rewrite the value part. Only if there are exact matches
> > > > > > > > > then it would skip the value part. JFYI.
> > > > > > > > >
> > > > > > > > > Regards
> > > > > > > > > Ram
> > > > > > > > >
> > > > > > > > > On Thu, Nov 13, 2014 at 11:23 AM, Jianshi Huang wrote:
> > > > > > > > >
> > > > > > > > > > I thought FASTDIFF was only for rowkey and columns, great
> > > > > > > > > > if it also works in value cell.
> > > > > > > > > >
> > > > > > > > > > And thanks for the bjson link!
> > > > > > > > > >
> > > > > > > > > > Jianshi
> > > > > > > > > >
> > > > > > > > > > On Thu, Nov 13, 2014 at 1:18 PM, Ted Yu wrote:
> > > > > > > > > >
> > > > > > > > > > > There is FASTDIFF data block encoding.
> > > > > > > > > > >
> > > > > > > > > > > See also http://bjson.org/
> > > > > > > > > > >
> > > > > > > > > > > Cheers
> > > > > > > > > > >
> > > > > > > > > > > On Nov 12, 2014, at 9:08 PM, Jianshi Huang wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > I'm currently saving JSON in pure String format in the
> > > > > > > > > > > > value cell and depends on HBase' block compression to
> > > > > > > > > > > > reduce the overhead of JSON.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm wondering if there's a more space efficient way to
> > > > > > > > > > > > store JSON? (there're lots of 0s and 1s, JSON String
> > > > > > > > > > > > actually is an OK format)
> > > > > > > > > > > >
> > > > > > > > > > > > I want to keep the value as a Map since the schema of
> > > > > > > > > > > > source data might change over time.
> > > > > > > > > > > >
> > > > > > > > > > > > Also is there a DIFF based encoding for values? Since
> > > > > > > > > > > > I'm storing historical data (snapshot data) and changes
> > > > > > > > > > > > between adjacent value cells are relatively small.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > --
> > > > > > > > > > > > Jianshi Huang
> > > > > > > > > > > >
> > > > > > > > > > > > LinkedIn: jianshi
> > > > > > > > > > > > Twitter: @jshuang
> > > > > > > > > > > > Github & Blog: http://huangjs.github.com/
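
On the idea raised in the thread of flattening JSON into many columns: the following is only a minimal sketch of what such a flattening step might look like, not something anyone in the thread proposed or tested. The function name flatten_json, the "." separator, and the choice to store each leaf as JSON text are all assumptions made for illustration. Writing the resulting cells to HBase is not shown, and the sketch does not answer the question of per-column overhead.

```python
import json


def flatten_json(obj, prefix="", sep="."):
    """Flatten nested dicts/lists into {column_qualifier: value_bytes}.

    Hypothetical helper for illustration only. Limitations of this sketch:
    empty dicts/lists produce no cells, and keys that already contain the
    separator can collide with nested paths.
    """
    cells = {}
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        # List positions become path segments: "flags.0", "flags.1", ...
        items = enumerate(obj)
    else:
        # Leaf value: keep it as JSON text so the type can be recovered later.
        cells[prefix] = json.dumps(obj).encode("utf-8")
        return cells

    for key, value in items:
        qualifier = f"{prefix}{sep}{key}" if prefix else str(key)
        cells.update(flatten_json(value, qualifier, sep))
    return cells


if __name__ == "__main__":
    snapshot = json.loads('{"user": {"id": 7, "active": true}, "flags": [0, 1]}')
    for qualifier, value in sorted(flatten_json(snapshot).items()):
        print(qualifier, value)
```

With one cell per leaf, a snapshot that repeats the previous one in most fields would give the exact-match case Ram describes for FASTDIFF a chance to apply to those unchanged cells, whereas a single JSON string that differs anywhere is rewritten in full. Whether that outweighs the cost of storing many more cells would have to be measured on real data.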
