Hello again,
I just found something interesting in the logs:

INFO org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat: setScan with ranges: 5192296858534827628530496329220096 - 5192343374370748142029900260897474 ( 46515835920513499403931677378)

But in my case, the range should rather be from 1020576114013268896970538800 to 72576215356229636519498348368 (when interpreting those numbers as the arbitrary-precision integer representation of the row key).

Best regards,
Lukas

On Mon, Jan 24, 2011 at 10:07 AM, Mr. Lukas <mr.bobu...@gmail.com> wrote:
> Hi Dmitriy,
> Sorry for the late reply, I was out of office.
> Discarding the caster and caching options (i.e. using only the -loadKey
> option) does not change anything except that some
> FIELD_DISCARDED_TYPE_CONVERSION_FAILED warnings are issued.
>
> On Fri, Jan 21, 2011 at 1:42 AM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>>
>> This is quite odd because I do the same thing on a multi-million row table
>> and get multiple regions ...
>> You do have multiple regions, right? What happens if you only specify the
>> -loadKey parameter and none of the others?
>>
>> On Thu, Jan 20, 2011 at 8:24 AM, Mr. Lukas <mr.bobu...@gmail.com> wrote:
>>
>> > Hi pig users,
>> > I'm also using pig 0.8 together with HBase 0.20.6 and think my problem is
>> > related to Ian's. When processing a table with millions of rows (stored in
>> > multiple regions), HBaseStorage won't scan the full table but only a few
>> > hundred records.
>> >
>> > The following minimal example reproduces my problem (for this table):
>> >
>> > REGISTER '/path/to/guava-r07.jar';
>> > SET DEFAULT_PARALLEL 30;
>> > items = LOAD 'hbase://some-table' USING
>> >     org.apache.pig.backend.hadoop.hbase.HBaseStorage('family:column',
>> >     '-caster HBaseBinaryConverter -caching 500 -loadKey')
>> >     AS (key:bytearray, a_column:long);
>> > items = GROUP items ALL;
>> > item_count = FOREACH items GENERATE COUNT_STAR($1);
>> > DUMP item_count;
>> >
>> > Pig issues just one mapper and I guess that it scans just one region of
>> > the table. Or did I miss some fundamental configuration options?
>> >
>> > Best regards,
>> > Lukas
>> >
>
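
For illustration, the range comparison described in the newest message can be checked with a small Python sketch. This is not Pig or HBase code. It assumes that the numbers in the log line are row keys read as big-endian unsigned integers, and the helper names (row_key_to_int, int_to_row_key, ranges_overlap) are hypothetical. The decimal values are copied from the log line and from the expected range given in the message.

```python
def row_key_to_int(key: bytes) -> int:
    """Interpret a row key as a big-endian unsigned integer (assumed encoding)."""
    return int.from_bytes(key, "big")


def int_to_row_key(value: int) -> bytes:
    """Inverse of row_key_to_int: shortest big-endian byte string for value."""
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")


def ranges_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if the closed ranges [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


# Values copied from the setScan log line.
SCAN_START = 5192296858534827628530496329220096
SCAN_END = 5192343374370748142029900260897474
SCAN_WIDTH = 46515835920513499403931677378

# Range the message says it expected instead.
EXPECTED_START = 1020576114013268896970538800
EXPECTED_END = 72576215356229636519498348368

if __name__ == "__main__":
    # Is the parenthesised number in the log simply end minus start?
    print("width matches:", SCAN_END - SCAN_START == SCAN_WIDTH)
    # Does the logged scan range touch the expected key range at all?
    print("overlap:", ranges_overlap(SCAN_START, SCAN_END,
                                     EXPECTED_START, EXPECTED_END))
```

One caveat on this sketch: HBase orders row keys lexicographically by byte, so a numeric comparison like this matches the real key order only when the keys have the same length.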