[
https://issues.apache.org/jira/browse/PIG-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971309#comment-15971309
]
Koji Noguchi commented on PIG-5224:
-----------------------------------
Stepping back a bit and giving a little more detail.
For the script in the description, the PhysicalPlan looks like
{noformat}
D: Store(/tmp/delteme:org.apache.pig.builtin.PigStorage) - scope-18
|
|---D: New For Each(false)[bag] - scope-17
| |
| POUserFunc(org.apache.pig.test.utils.AccumulatorBagCount)[int] - scope-13
| |
| |---RelationToExpressionProject[bag][*] - scope-12
| |
| |---o: POSort[bag]() - scope-16
| | |
| | Project[int][0] - scope-15
| |
| |---Project[bag][0] - scope-14
|
|---****C: New For Each(false)[bag] - scope-11****
| |
| Project[bag][1] - scope-9
|
|---C: Package(Packager)[tuple]{int} - scope-6
{noformat}
where the "C" marked with "\*\*\*\*" is the extra foreach added by column
pruning. This foreach conflicts with {{AccumulatorOptimizerUtil.addAccumulator}},
which looks only at the immediate successor of POPackage. That successor is now
"\*\*\*\*C: New For Each\*\*\*\*" instead of the "D: New For Each" that we want.
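To illustrate the conflict, here is a minimal Java sketch of the plan shape above. It is a toy model only: {{Op}}, {{Kind}}, {{projectOnly}}, {{immediateForEach}} and {{firstNonTrivialForEach}} are hypothetical names made up for this example, not Pig classes or methods, and skipping a project-only foreach is just one conceivable way to reach "D". It is not meant to describe what the attached patch does.

```java
public class AccumulatorPlanSketch {
    /** Toy operator kinds; these are NOT Pig's real operator classes. */
    enum Kind { PACKAGE, FOREACH, STORE }

    /** A toy plan node with a single successor, enough for this linear plan. */
    static final class Op {
        final String alias;
        final Kind kind;
        // Assumption: true for a foreach that only projects columns,
        // such as the extra one added by column pruning.
        final boolean projectOnly;
        Op successor;

        Op(String alias, Kind kind, boolean projectOnly) {
            this.alias = alias;
            this.kind = kind;
            this.projectOnly = projectOnly;
        }
    }

    /** Looks only at the immediate successor of the package, as described above. */
    static Op immediateForEach(Op pkg) {
        Op next = pkg.successor;
        return (next != null && next.kind == Kind.FOREACH) ? next : null;
    }

    /** Hypothetical alternative: step over project-only foreach operators first. */
    static Op firstNonTrivialForEach(Op pkg) {
        Op next = pkg.successor;
        while (next != null && next.kind == Kind.FOREACH && next.projectOnly) {
            next = next.successor;
        }
        return (next != null && next.kind == Kind.FOREACH) ? next : null;
    }

    public static void main(String[] args) {
        // Package -> extra "C" foreach -> "D" foreach -> Store, as in the plan above.
        Op pkg = new Op("C", Kind.PACKAGE, false);
        Op pruned = new Op("C", Kind.FOREACH, true);
        Op d = new Op("D", Kind.FOREACH, false);
        Op store = new Op("D", Kind.STORE, false);
        pkg.successor = pruned;
        pruned.successor = d;
        d.successor = store;

        System.out.println(immediateForEach(pkg).alias);       // prints "C"
        System.out.println(firstNonTrivialForEach(pkg).alias); // prints "D"
    }
}
```

In this toy model the immediate-successor lookup lands on the extra "C" foreach, which has no UDF to accumulate into, while the second lookup reaches the "D" foreach that contains the AccumulatorBagCount call.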
> Extra foreach from ColumnPrune preventing Accumulator usage
> -----------------------------------------------------------
>
> Key: PIG-5224
> URL: https://issues.apache.org/jira/browse/PIG-5224
> Project: Pig
> Issue Type: Improvement
> Reporter: Koji Noguchi
> Assignee: Koji Noguchi
> Attachments: pig-5224-v0-testonly.patch, pig-5224-v1.patch
>
>
> {code}
> A = load 'input' as (id:int, fruit);
> B = foreach A generate id; -- to enable columnprune
> C = group B by id;
> D = foreach C {
> o = order B by id;
> generate org.apache.pig.test.utils.AccumulatorBagCount(o);
> }
> STORE D into ...
> {code}
> Pig fails to use Accumulator interface for this UDF.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)