[FORMATTING correction, apologies]
Here's one sloppy solution:
    rmf temp;
    STORE a INTO 'temp';

    -- load the bag as a chararray and morph it to my will
    new = LOAD 'temp' USING PigStorage() AS (
        id: chararray,
        bitmap: chararray
    );

    -- strip all the {()} characters and spaces, then STRSPLIT into a tuple
    -- on the commas (99999 is just a generous upper limit on the splits)
    i = FOREACH new GENERATE
        id,
        STRSPLIT(REPLACE(bitmap, '[\\{\\(\\)\\} ]', ''), ',', 99999) AS bitmap;
This works, but it's supposed to be part of a macro. Macros are new for us and
I haven't tried it yet, but the docs say Grunt shell commands can't be executed
inside a macro, so we wouldn't be able to run "rmf temp" there.
It still seems like I'm missing some way to dereference the elements and get
what I want directly.
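One alternative that might avoid the temp file is a Python (Jython) UDF. The
following is only a minimal sketch, untested inside Pig. It assumes Pig hands
the bag to the UDF as a list of tuples, still in sorted order, and the function
name bag_to_single_tuple is made up for illustration:

```python
def bag_to_single_tuple(bag):
    """Concatenate the tuples of a bag, in order, into one tuple.

    `bag` is assumed to be a list of tuples (how a Pig bag is expected to
    arrive in a Jython UDF). The result is a one-element list, i.e. a bag
    holding a single tuple, matching the desired {(1,0,0,0,0)} shape.
    """
    flat = []
    for t in bag:
        # each t is a single-element tuple here, but extend() copes with
        # longer tuples too and keeps the original order
        flat.extend(t)
    return [tuple(flat)]


# The 5-tuple example: {(1),(0),(0),(0),(0)} becomes {(1,0,0,0,0)}
example = bag_to_single_tuple([(1,), (0,), (0,), (0,), (0,)])
```

From Pig, the file would presumably be registered with something like
REGISTER 'flatten_bag.py' USING jython AS udfs; (the file name and alias are
hypothetical), and an output schema would still have to be declared for the
function, which this sketch leaves out.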
Steve
-----Original Message-----
I have a post-grouping relation:
    a: {id: chararray, bitmap: {(value_binary: int)}}
where the value_binary tuples are single-element tuples that have been sorted;
the order of the single-element tuples is important. All the "bitmap" bags are
guaranteed to have the same number of single-element tuples, but that number is
arbitrary. That is, I can't know in advance how many tuples there will be in
"bitmap", but I can depend on each bitmap having the same number of tuples. An
example of an instance with 5 tuples:
9 {(1),(0),(0),(0),(0)}
Would need to become:
9 {(1,0,0,0,0)}
...concatenating those tuples into one tuple, preserving the order, again
without having advance knowledge of how many tuples will be in "bitmap". I
can't figure out how to do it.
Thanks in advance for any suggestions...
Steve