[
https://issues.apache.org/jira/browse/PIG-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheolsoo Park resolved PIG-3319.
--------------------------------
Resolution: Won't Fix
I ended up fixing my LoadFunc instead.
Originally, I planned to fix this in Pig so that any LoadFunc that uses
TupleFactory.newTupleNoCopy would not run into this problem, but that seems to
require a lot more changes.
I am marking the JIRA as Won't Fix for now.
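
The comment above does not show the LoadFunc change itself. As a minimal
sketch of one way such a fix could look, assuming the loader had been reusing
a single list across records, each record can be given its own freshly
allocated list. Everything here is illustrative: NoCopyTuple is a hypothetical
stand-in for a tuple created with TupleFactory.newTupleNoCopy, and
FreshListLoader, nextReusing and nextFresh are made-up names, not Pig APIs.

```java
import java.util.ArrayList;
import java.util.List;

class FreshListLoader {
    /** Hypothetical stand-in for a no-copy tuple; this is not Pig's Tuple class. */
    static final class NoCopyTuple {
        final List<Object> fields;

        NoCopyTuple(List<Object> fields) {
            this.fields = fields; // keeps the caller's list, no defensive copy
        }
    }

    // Single list shared by every tuple returned from nextReusing().
    private final List<Object> reused = new ArrayList<>();

    /** Problematic pattern: every returned tuple is backed by the same list. */
    NoCopyTuple nextReusing(String line) {
        reused.clear();
        reused.add(line);
        return new NoCopyTuple(reused);
    }

    /** Safer pattern: the loader never touches a list again after handing it over. */
    NoCopyTuple nextFresh(String line) {
        List<Object> fields = new ArrayList<>(1);
        fields.add(line);
        return new NoCopyTuple(fields);
    }

    public static void main(String[] args) {
        FreshListLoader loader = new FreshListLoader();

        NoCopyTuple a = loader.nextReusing("line-1");
        loader.nextReusing("line-2");
        System.out.println(a.fields); // [line-2]: the earlier tuple was overwritten

        NoCopyTuple c = loader.nextFresh("line-1");
        loader.nextFresh("line-2");
        System.out.println(c.fields); // [line-1]: the earlier tuple is unaffected
    }
}
```

The trade-off is one extra list allocation per record in exchange for tuples
that stay stable after they leave the loader.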
> Race condition in POStream
> --------------------------
>
> Key: PIG-3319
> URL: https://issues.apache.org/jira/browse/PIG-3319
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.11.1
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.12
>
>
> When LOAD is immediately followed by STREAM, the Pig job intermittently fails
> with either a ConcurrentModificationException or an IndexOutOfBoundsException.
> {code}
> a = LOAD '<input>' USING MyLoadFunc();
> b = STREAM a THROUGH dummy AS (foo:chararray);
> DUMP b;
> {code}
> The problem is that if the LoadFunc creates each new tuple using
> TupleFactory.newTupleNoCopy and reuses the same fields list object across
> tuples, that list can be accessed concurrently by ProcessInputThread and
> POStream, with one side iterating it while the other side modifies it.
> {code}
> /**
>  * Create a tuple from a provided list of objects, keeping the provided
>  * list. The new tuple will take over ownership of the provided list.
>  * @param list List of objects that will become the fields of the tuple.
>  * @return A tuple with the list objects as its fields
>  */
> public abstract Tuple newTupleNoCopy(List list);
> {code}
> Here is an example:
> # LoadFunc loads a line and creates a new tuple using List<Object> L.
> # POStream passes it to the ProcessInputThread of ExecutableManager.
> # ProcessInputThread starts iterating L to serialize it before feeding it to
> the sub-process.
> # LoadFunc loads another line and creates a new tuple by re-using L.
> # ConcurrentModificationException is thrown because L is modified while being
> iterated.
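
The steps above can be replayed deterministically on a single thread, which
makes the failure easier to see than in the real two-thread setup. The sketch
below is a simplified illustration under stated assumptions, not Pig code:
NoCopyTuple is a hypothetical stand-in for a tuple built with
newTupleNoCopy, the iterator plays the role of ProcessInputThread, and
SharedListRace and replay are made-up names.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class SharedListRace {
    /** Hypothetical stand-in for a no-copy tuple; this is not Pig's Tuple class. */
    static final class NoCopyTuple {
        private final List<Object> fields;

        NoCopyTuple(List<Object> fields) {
            this.fields = fields; // keeps the caller's list, no defensive copy
        }

        List<Object> getAll() {
            return fields;
        }
    }

    /** Replays the five steps in order; returns true if the iteration failed. */
    static boolean replay() {
        // Step 1: the loader builds a tuple on top of list L.
        List<Object> l = new ArrayList<>();
        l.add("line-1");
        NoCopyTuple first = new NoCopyTuple(l);

        // Steps 2-3: the consumer receives the tuple and starts iterating L.
        Iterator<Object> it = first.getAll().iterator();

        // Step 4: the loader reuses L for the next record.
        l.clear();
        l.add("line-2");
        NoCopyTuple second = new NoCopyTuple(l);

        // Step 5: the consumer continues and detects the modification.
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(replay()); // true
    }
}
```

In the real job the two sides run on different threads, so whether the
modification lands in the middle of an iteration depends on timing, which is
consistent with the failure being intermittent.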