[ 
https://issues.apache.org/jira/browse/PIG-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park resolved PIG-3319.
--------------------------------

    Resolution: Won't Fix

I ended up fixing my LoadFunc.

Originally, I thought I would fix this in Pig so that any LoadFunc that 
uses TupleFactory.newTupleNoCopy would not run into this problem. But that 
seems to require a lot more changes.

I am marking the JIRA as Won't Fix for now.
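
The comment does not say how the LoadFunc was changed. One plausible approach, 
sketched below without any Pig dependency, is to allocate a fresh fields list 
for every record instead of refilling a shared one. NoCopyTuple, loadReusing 
and loadFresh are hypothetical names for illustration; NoCopyTuple only stands 
in for a tuple that keeps the caller's list, as newTupleNoCopy is documented 
to do.

```java
import java.util.ArrayList;
import java.util.List;

public class FreshListPerRecord {
    /** Hypothetical stand-in for a tuple that keeps the caller's list (no copy). */
    static final class NoCopyTuple {
        final List<Object> fields;
        NoCopyTuple(List<Object> fields) { this.fields = fields; }
    }

    // Problematic pattern: one list instance refilled for every record.
    private final List<Object> shared = new ArrayList<>();

    NoCopyTuple loadReusing(String line) {
        shared.clear();
        for (String col : line.split(",")) {
            shared.add(col);
        }
        return new NoCopyTuple(shared); // every tuple aliases the same list
    }

    // Safer pattern: a new list per record, so earlier tuples are never touched.
    NoCopyTuple loadFresh(String line) {
        String[] cols = line.split(",");
        List<Object> fields = new ArrayList<>(cols.length);
        for (String col : cols) {
            fields.add(col);
        }
        return new NoCopyTuple(fields);
    }

    public static void main(String[] args) {
        FreshListPerRecord loader = new FreshListPerRecord();

        NoCopyTuple first = loader.loadFresh("a,b");
        loader.loadFresh("c,d");
        System.out.println(first.fields);  // first record is unchanged

        NoCopyTuple reused = loader.loadReusing("a,b");
        loader.loadReusing("c,d");
        System.out.println(reused.fields); // now shows the second record
    }
}
```

With a fresh list per record, a tuple that has already been handed downstream 
is no longer affected by later loads, at the cost of one extra allocation per 
record.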
                
> Race condition in POStream
> --------------------------
>
>                 Key: PIG-3319
>                 URL: https://issues.apache.org/jira/browse/PIG-3319
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>
> When LOAD is immediately followed by STREAM, the Pig job intermittently 
> fails with either ConcurrentModificationException or 
> IndexOutOfBoundsException. 
> {code}
> a = LOAD '<input>' USING MyLoadFunc();
> b = STREAM a THROUGH dummy AS (foo:chararray);
> DUMP b;
> {code}
> The problem is that if the LoadFunc creates a new tuple using 
> TupleFactory.newTupleNoCopy, the tuple keeps the LoadFunc's fields list 
> instead of copying it. When the LoadFunc reuses that list, it can be 
> concurrently modified by ProcessInputThread and POStream.
> {code}
> /**
>  * Create a tuple from a provided list of objects, keeping the provided
>  * list.  The new tuple will take over ownership of the provided list.
>  * @param list List of objects that will become the fields of the tuple.
>  * @return A tuple with the list objects as its fields
>  */
> public abstract Tuple newTupleNoCopy(List list);
> {code}
> Here is an example:
> # LoadFunc loads a line and creates a new tuple using List<Object> L.
> # POStream passes it to the ProcessInputThread of ExecutableManager.
> # ProcessInputThread starts iterating L to serialize it before feeding it to 
> the sub-process. 
> # LoadFunc loads another line and creates a new tuple by re-using L.
> # ConcurrentModificationException is thrown because L is modified while being 
> iterated.
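
The interleaving in the steps above can be reproduced deterministically in 
plain Java, without Pig and without threads, by interleaving the two roles by 
hand. This is only a sketch: it assumes the fields list is a java.util.ArrayList 
that the loader clears and refills per record, and SharedListHazard and 
reuseBreaksIteration are hypothetical names.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class SharedListHazard {
    /** Returns true if iteration fails after the "loader" reuses the list. */
    static boolean reuseBreaksIteration() {
        // Step 1: the loader fills list L for the first record.
        List<Object> fields = new ArrayList<>();
        fields.add("line1-col1");
        fields.add("line1-col2");

        // Step 3: the consumer starts iterating L to serialize it.
        Iterator<Object> it = fields.iterator();
        it.next();

        // Step 4: the loader reuses L for the next record.
        fields.clear();
        fields.add("line2-col1");
        fields.add("line2-col2");

        // Step 5: the consumer resumes and detects the modification.
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME after reuse: " + reuseBreaksIteration());
    }
}
```

In the real job the two roles run on different threads, so the failure depends 
on timing, which matches the intermittent behavior described in the report.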

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
