[jira] [Created] (PIG-5425) Pig 0.15 and later don't set context signature correctly

2022-07-12 Thread Jacob Tolar (Jira)
Jacob Tolar created PIG-5425:


 Summary: Pig 0.15 and later don't set context signature correctly
 Key: PIG-5425
 URL: https://issues.apache.org/jira/browse/PIG-5425
 Project: Pig
 Issue Type: Improvement
 Reporter: Jacob Tolar


As an author of Pig UDFs, I expect that in EvalFunc ( 
[https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java]
 ) {{setUDFContextSignature}} is called before {{setInputSchema}}. This was 
the case up through Pig 0.14.

In Pig 0.15 and later (according to the git tags, at least; I've only checked 
0.17), this is no longer true.

The following commit introduced the problematic behavior: 
[https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81]

The issue is at line 513 of 
src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 
([git blame 
link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]), 
which was introduced in that commit.

 

There, {{f.setInputSchema()}} is called without previously calling 
{{f.setUDFContextSignature(signature)}}. 

Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} is 
called, but POUserFunc [re-instantiates the EvalFunc and does not actually use 
the func argument passed in its 
constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128]
 (quite confusing, but probably attributable to changes over time). 

{{f}} is discarded, so it should be safe to call 
{{f.setUDFContextSignature(signature)}} before {{f.setInputSchema()}} as a 
simple fix.

The code here is arguably unnecessarily complex and could probably be cleaned 
up further, but I propose the simple fix above without a larger refactoring.
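
To make the ordering problem concrete, here is a minimal, self-contained sketch. It does not use Pig's classes: {{RecordingFunc}} is a hypothetical stand-in for an EvalFunc whose {{setInputSchema}} reads the signature, and {{signatureSeen}} is a hypothetical helper that mimics the translation step described above, with and without the proposed extra call. The string schema and the {{"sig-1"}} value are placeholders, not anything Pig produces.

```java
public class SignatureOrderSketch {

    /** Hypothetical stand-in for a Pig EvalFunc; only the two setters matter here. */
    static class RecordingFunc {
        private String signature; // stays null until setUDFContextSignature runs
        String signatureSeenInSchemaHook;

        void setUDFContextSignature(String signature) {
            this.signature = signature;
        }

        void setInputSchema(String schema) {
            // A UDF that keys per-instance state on its signature would read it here.
            signatureSeenInSchemaHook = signature;
        }
    }

    /**
     * Mimics the call sequence described in the report (an assumption based on
     * the description, not a copy of Pig's code). With applyFix == false the
     * schema is set on a fresh instance that never received its signature;
     * with applyFix == true the signature is set first.
     */
    static String signatureSeen(boolean applyFix) {
        RecordingFunc f = new RecordingFunc();
        if (applyFix) {
            f.setUDFContextSignature("sig-1");
        }
        f.setInputSchema("a:int");
        return f.signatureSeenInSchemaHook;
    }

    public static void main(String[] args) {
        System.out.println("reported ordering: " + signatureSeen(false));
        System.out.println("with fix:          " + signatureSeen(true));
    }
}
```

In this sketch the first call sees {{null}} and the second sees {{sig-1}}, which is the difference a UDF author would observe if {{setInputSchema}} depends on the signature.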

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (PIG-5425) Pig 0.15 and later don't set context signature correctly

2022-07-12 Thread Jacob Tolar (Jira)


 [ 
https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Tolar updated PIG-5425:
-
Attachment: PIG-5425.0.patch
Status: Patch Available  (was: Open)

> Pig 0.15 and later don't set context signature correctly
> 
>
> Key: PIG-5425
> URL: https://issues.apache.org/jira/browse/PIG-5425
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jacob Tolar
>Priority: Major
> Attachments: PIG-5425.0.patch
>
>


[jira] [Commented] (PIG-5425) Pig 0.15 and later don't set context signature correctly

2022-07-12 Thread Jacob Tolar (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566025#comment-17566025
 ] 

Jacob Tolar commented on PIG-5425:
--

Attached patch with the suggested fix.



[jira] [Commented] (PIG-5425) Pig 0.15 and later don't set context signature correctly

2022-07-12 Thread Jacob Tolar (Jira)


[ 
https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566039#comment-17566039
 ] 

Jacob Tolar commented on PIG-5425:
--

If you prefer the somewhat more invasive refactoring, I can do that; I was just 
trying to minimize the risk of causing other issues.



[jira] Subscription: PIG patch available

2022-07-12 Thread jira
Issue Subscription
Filter: PIG patch available (39 issues)

Subscriber: pigdaily

Key       Summary
PIG-5425  Pig 0.15 and later don't set context signature correctly
          https://issues.apache.org/jira/browse/PIG-5425
PIG-5418  Utils.parseSchema(String), parseConstant(String) leak memory
          https://issues.apache.org/jira/browse/PIG-5418
PIG-5414  Build failure on Linux ARM64 due to old Apache Avro
          https://issues.apache.org/jira/browse/PIG-5414
PIG-5380  SortedDataBag hitting ConcurrentModificationException or producing incorrect output in a corner-case
          https://issues.apache.org/jira/browse/PIG-5380
PIG-5377  Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
          https://issues.apache.org/jira/browse/PIG-5377
PIG-5369  Add llap-client dependency
          https://issues.apache.org/jira/browse/PIG-5369
PIG-5360  Pig sets working directory of input file systems causes exception thrown
          https://issues.apache.org/jira/browse/PIG-5360
PIG-5338  Prevent deep copy of DataBag into Jython List
          https://issues.apache.org/jira/browse/PIG-5338
PIG-5323  Implement LastInputStreamingOptimizer in Tez
          https://issues.apache.org/jira/browse/PIG-5323
PIG-5273  _SUCCESS file should be created at the end of the job
          https://issues.apache.org/jira/browse/PIG-5273
PIG-5256  Bytecode generation for POFilter and POForeach
          https://issues.apache.org/jira/browse/PIG-5256
PIG-5160  SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
          https://issues.apache.org/jira/browse/PIG-5160
PIG-5115  Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
          https://issues.apache.org/jira/browse/PIG-5115
PIG-5106  Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
          https://issues.apache.org/jira/browse/PIG-5106
PIG-5081  Can not run pig on spark source code distribution
          https://issues.apache.org/jira/browse/PIG-5081
PIG-5080  Support store alias as spark table
          https://issues.apache.org/jira/browse/PIG-5080
PIG-5057  IndexOutOfBoundsException when pig reducer processOnePackageOutput
          https://issues.apache.org/jira/browse/PIG-5057
PIG-5029  Optimize sort case when data is skewed
          https://issues.apache.org/jira/browse/PIG-5029
PIG-4926  Modify the content of start.xml for spark mode
          https://issues.apache.org/jira/browse/PIG-4926
PIG-4913  Reduce jython function initiation during compilation
          https://issues.apache.org/jira/browse/PIG-4913
PIG-4849  pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
          https://issues.apache.org/jira/browse/PIG-4849
PIG-4750  REPLACE_MULTI should compile Pattern once and reuse it
          https://issues.apache.org/jira/browse/PIG-4750
PIG-4684  Exception should be changed to warning when job diagnostics cannot be fetched
          https://issues.apache.org/jira/browse/PIG-4684
PIG-4656  Improve String serialization and comparator performance in BinInterSedes
          https://issues.apache.org/jira/browse/PIG-4656
PIG-4598  Allow user defined plan optimizer rules
          https://issues.apache.org/jira/browse/PIG-4598
PIG-4551  Partition filter is not pushed down in case of SPLIT
          https://issues.apache.org/jira/browse/PIG-4551
PIG-4539  New PigUnit
          https://issues.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
          https://issues.apache.org/jira/browse/PIG-4515
PIG-4373  Implement PIG-3861 in Tez
          https://issues.apache.org/jira/browse/PIG-4373
PIG-4323  PackageConverter hanging in Spark
          https://issues.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
          https://issues.apache.org/jira/browse/PIG-4313
PIG-4002  Disable combiner when map-side aggregation is used
          https://issues.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
          https://issues.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
          https://issues.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
          https://issues.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
          https://issues.apache.org/jira/browse/PIG-3873
PIG-3668  COR built-in function when atleast one of the coefficient values is NaN
          https://issues.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
          https://issues.apache.org/jira/browse/PIG-3587
PIG-1804  Alow Jython function to implement Algebraic a