I tried to explain why, in my basic understanding, no operation inside a FOREACH (COUNT, COUNT_STAR or anything else) can succeed here: the relation is empty, so the FOREACH has nothing to run on. I would still appreciate any hints or tricks to achieve the above.
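One possible workaround, given only as an untested sketch: group the unfiltered relation and apply the filter inside the FOREACH block, so that the group exists even when no record matches. This assumes that 'records' itself contains at least one row and that COUNT returns 0 for an empty bag. The alias 'valid_records' is a hypothetical name chosen for illustration.

```pig
-- Same input and schema as in the original question.
records = LOAD 'test.csv' USING PigStorage(',') AS (id:long, valid:boolean);

-- Group the unfiltered relation: GROUP ... ALL yields one group
-- as long as 'records' has at least one row (assumption).
test_count = FOREACH (GROUP records ALL) {
    -- Nested filter: may produce an empty bag, but the group still exists.
    valid_records = FILTER records BY valid == true;
    GENERATE COUNT(valid_records);
};

DUMP test_count;
```

If 'records' can be empty as well, this sketch would produce no output either, for the same reason as the original script.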
2013/5/29 Shahab Yunus <[email protected]>

> So basically this means that we were trying to look at this from an RDBMS
> SQL perspective, where 'SELECT COUNT(*) FROM TABLE' returns 0 even if there
> is nothing in the result set, and that is why we ignored the possibility
> that FOREACH might not be executed at all (which could be by design)?
>
> -Shahab
>
>
> On Wed, May 29, 2013 at 10:13 AM, Marco Brinkmann
> <[email protected]> wrote:
>
> > Thanks, but this does not change anything. My personal guess (and I have
> > only worked with Pig for a few days) is that FOREACH will never be
> > executed, because the relation 'test' is empty.
> >
> >
> > 2013/5/29 Shahab Yunus <[email protected]>
> >
> > > Try COUNT_STAR.
> > >
> > > -Shahab
> > >
> > >
> > > On Wed, May 29, 2013 at 9:55 AM, Marco Brinkmann
> > > <[email protected]> wrote:
> > >
> > > > Hi everybody,
> > > >
> > > > I have a rather simple question and scenario, but I still could not
> > > > find an answer in the documentation or in other resources:
> > > >
> > > > id, valid
> > > > (1, false)
> > > > (2, false)
> > > >
> > > > records = LOAD 'test.csv' USING PigStorage(',') AS (id:long,
> > > > valid:boolean);
> > > >
> > > > test = FILTER records BY valid == true;
> > > > test_count = FOREACH (GROUP test ALL) GENERATE COUNT(test);
> > > >
> > > > DUMP test_count;
> > > >
> > > >
> > > > I would expect that 'test_count' now contains '0'. But the dump is
> > > > completely empty (with 'valid == false' I get '(2)' as expected). I
> > > > use Pig 0.11.1.
> > > >
> > > > Could someone point me in the right direction?
> > > >
> > > > Cheers, Marco
