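HBase stores row keys as raw byte arrays, and string keys are normally encoded as UTF-8 (for example by Bytes.toBytes(String)), so it can store non-ASCII characters too. Yes, each of those characters takes more than 1 byte, and since row keys are compared byte by byte as unsigned values, those bytes (0x80 and above) sort after ~ (0x7E).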
A relevant thread:
http://search-hadoop.com/m/aJ0702Sq3Ii2/Scan+%2528Start+Row%252C+End+Row%2529+vs+Scan+%2528Row%2529&subj=RE+Scan+Start+Row+End+Row+vs+Scan+Row+

Hope this helps.
Himanshu

On Tue, Apr 26, 2011 at 11:43 PM, Hari Sreekumar <[email protected]> wrote:

> Just in case if there are other characters which fall after ~. We were using
> 'z' before. Then we realized we had some special characters in keys. So we
> updated it to ~. Does HBase support characters > ~ in row keys?
>
> hari
>
> On Tue, Apr 26, 2011 at 7:35 PM, Suraj Varma <[email protected]> wrote:
>
> > Why did you feel this is error prone?
> > If you use server side filters by providing start/end Row or use a
> > PrefixFilter, the scans should work well, as it is going to be
> > sequential access. Depending on your data and use case, you may need
> > to tune it further (say by applying additional filters, limit results,
> > etc)... see here for some more tips on speeding up scans:
> > http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A16
> >
> > --Suraj
> >
> > On Tue, Apr 26, 2011 at 2:45 AM, Hari Sreekumar
> > <[email protected]> wrote:
> > > Hi,
> > >
> > > I need to scan rows which have rowskey starting with a particular string
> > > (say abc). I am currently doing this by using startrow=abc and endrow=abc~.
> > > (I am appending ~ as it is ASCII 126). It usually works, but is there a
> > > better, less error prone way? I know we can do this using filters, but won't
> > > that be worse performance-wise?
> > >
> > > Thanks,
> > > Hari
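
On the original question of a less error-prone end row than abc~: one common approach is to derive the exclusive stop row from the prefix itself by incrementing its last byte (abc becomes abd), so every key that starts with the prefix falls inside the range no matter which bytes follow. The sketch below is plain Java with no HBase dependency. The class and method names (PrefixScanBounds, stopRowForPrefix, compareRows, inRange) are made up for illustration and are not HBase APIs; compareRows only mimics the unsigned byte ordering HBase uses for row keys. Treat it as a sketch under those assumptions, not as the client library's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: computes scan bounds for a row-key prefix.
class PrefixScanBounds {

    // Returns the smallest key that sorts after every key starting with prefix.
    // An empty array means "no upper bound" (prefix is empty or all 0xFF bytes).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // e.g. 'c' (0x63) becomes 'd' (0x64)
        return stop;
    }

    // Lexicographic comparison treating bytes as unsigned values (0..255).
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True if start <= row < stop; an empty stop means the range is open-ended.
    static boolean inRange(byte[] row, byte[] start, byte[] stop) {
        return compareRows(row, start) >= 0
                && (stop.length == 0 || compareRows(row, stop) < 0);
    }

    public static void main(String[] args) {
        byte[] start = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] tildeStop = "abc~".getBytes(StandardCharsets.UTF_8);
        byte[] prefixStop = stopRowForPrefix(start); // bytes of "abd"

        // "abc" followed by e-acute, which UTF-8 encodes as 0xC3 0xA9.
        byte[] row = "abc\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println(inRange(row, start, tildeStop));  // false: 0xC3 sorts after '~'
        System.out.println(inRange(row, start, prefixStop)); // true
    }
}
```

With real HBase, the same two byte arrays would be passed as the start and stop rows of the Scan. The PrefixFilter mentioned in the thread gives the same result set, and I believe newer client versions also provide Scan.setRowPrefixFilter(byte[]), which sets both bounds from a prefix; check the Javadoc for the version you run.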
