[jira] [Created] (PIG-5425) Pig 0.15 and later don't set context signature correctly
Jacob Tolar created PIG-5425:

             Summary: Pig 0.15 and later don't set context signature correctly
                 Key: PIG-5425
                 URL: https://issues.apache.org/jira/browse/PIG-5425
             Project: Pig
          Issue Type: Improvement
            Reporter: Jacob Tolar

As an author of Pig UDFs, my expectation in EvalFunc ( [https://github.com/apache/pig/blob/release-0.17.0/src/org/apache/pig/EvalFunc.java] ) is that {{setUDFContextSignature}} would be called before {{setInputSchema}}. This was the case up through Pig 0.14.

In Pig 0.15 and later (according to the git tags, at least; I've only checked 0.17), this is no longer true.

This commit introduced the problem behavior: [https://github.com/apache/pig/commit/8af34f1971628d1eeb0cd1f07fe03397ca887b81]

The issue is in src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java line 513 ([git blame link|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java#L513]), introduced in that commit.

There, {{f.setInputSchema()}} is called without previously calling {{f.setUDFContextSignature(signature)}}.

Note that on line 509, {{((POUserFunc)p).setSignature(op.getSignature());}} is called, but POUserFunc [re-instantiates the EvalFunc and does not actually use the func argument passed in its constructor|https://github.com/apache/pig/blame/release-0.17.0/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java#L119-L128] (quite confusing, but probably attributable to changes over time), so setting the signature on the POUserFunc does not set it on {{f}}.

{{f}} is discarded afterwards, so it should be safe to simply call {{f.setUDFContextSignature(signature)}} before {{f.setInputSchema()}} as a simple fix.

The code here is arguably unnecessarily complex and could probably be cleaned up further, but I propose the simple fix above without a larger refactoring.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
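To make the ordering problem described in the report concrete, here is a minimal sketch in plain Java. It does not use Pig at all: {{SketchFunc}}, {{SchemaRecordingFunc}}, {{PROPS}} and {{translate}} are hypothetical stand-ins invented for illustration, the schema is reduced to a String, and the static map only imitates per-signature UDF properties. It shows a UDF that keys its stored schema by signature, and what happens when the caller passes the schema with and without setting the signature first.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a UDF base class; this is NOT org.apache.pig.EvalFunc.
abstract class SketchFunc {
    void setUDFContextSignature(String signature) { }
    void setInputSchema(String schema) { }
}

// A UDF that records its input schema under its signature.
// This pattern only works if the signature arrives before the schema.
class SchemaRecordingFunc extends SketchFunc {
    // Assumption: a plain map imitating per-signature UDF properties.
    static final Map<String, String> PROPS = new HashMap<>();
    private String signature;

    @Override
    void setUDFContextSignature(String signature) {
        this.signature = signature;
    }

    @Override
    void setInputSchema(String schema) {
        if (signature == null) {
            throw new IllegalStateException(
                "setInputSchema called before setUDFContextSignature");
        }
        PROPS.put(signature, schema);
    }
}

class SignatureOrderSketch {
    // Imitates the caller: optionally sets the signature before the schema.
    static String translate(SketchFunc f, String signature, String schema,
                            boolean setSignatureFirst) {
        try {
            if (setSignatureFirst) {
                f.setUDFContextSignature(signature); // the proposed one-line fix
            }
            f.setInputSchema(schema);
            return "ok";
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Ordering reported for 0.15 and later: schema without signature.
        System.out.println(translate(new SchemaRecordingFunc(), "sig-1", "a:int", false));
        // Ordering expected by the UDF author: signature first, then schema.
        System.out.println(translate(new SchemaRecordingFunc(), "sig-1", "a:int", true));
    }
}
```

In this sketch the first call reports a failure and the second returns "ok". Whether a real UDF fails loudly or silently stores data under a null key depends on how that UDF is written; the exception here is only a choice made to keep the effect visible.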
[jira] [Updated] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacob Tolar updated PIG-5425:

    Attachment: PIG-5425.0.patch
        Status: Patch Available (was: Open)

> Pig 0.15 and later don't set context signature correctly
>
>                 Key: PIG-5425
>                 URL: https://issues.apache.org/jira/browse/PIG-5425
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jacob Tolar
>            Priority: Major
>         Attachments: PIG-5425.0.patch
[jira] [Commented] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566025#comment-17566025 ]

Jacob Tolar commented on PIG-5425:

Attached patch with the suggested fix.
[jira] [Commented] (PIG-5425) Pig 0.15 and later don't set context signature correctly
[ https://issues.apache.org/jira/browse/PIG-5425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566039#comment-17566039 ]

Jacob Tolar commented on PIG-5425:

If you prefer the somewhat more invasive refactoring, I can do that. I was just trying to minimize the risk of causing any other issues.
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (39 issues)
Subscriber: pigdaily

Key         Summary
PIG-5425    Pig 0.15 and later don't set context signature correctly
            https://issues.apache.org/jira/browse/PIG-5425
PIG-5418    Utils.parseSchema(String), parseConstant(String) leak memory
            https://issues.apache.org/jira/browse/PIG-5418
PIG-5414    Build failure on Linux ARM64 due to old Apache Avro
            https://issues.apache.org/jira/browse/PIG-5414
PIG-5380    SortedDataBag hitting ConcurrentModificationException or producing incorrect output in a corner-case
            https://issues.apache.org/jira/browse/PIG-5380
PIG-5377    Move supportsParallelWriteToStoreLocation from StoreFunc to StoreFuncInterfce
            https://issues.apache.org/jira/browse/PIG-5377
PIG-5369    Add llap-client dependency
            https://issues.apache.org/jira/browse/PIG-5369
PIG-5360    Pig sets working directory of input file systems causes exception thrown
            https://issues.apache.org/jira/browse/PIG-5360
PIG-5338    Prevent deep copy of DataBag into Jython List
            https://issues.apache.org/jira/browse/PIG-5338
PIG-5323    Implement LastInputStreamingOptimizer in Tez
            https://issues.apache.org/jira/browse/PIG-5323
PIG-5273    _SUCCESS file should be created at the end of the job
            https://issues.apache.org/jira/browse/PIG-5273
PIG-5256    Bytecode generation for POFilter and POForeach
            https://issues.apache.org/jira/browse/PIG-5256
PIG-5160    SchemaTupleFrontend.java is not thread safe, cause PigServer thrown NPE in multithread env
            https://issues.apache.org/jira/browse/PIG-5160
PIG-5115    Builtin AvroStorage generates incorrect avro schema when the same pig field name appears in the alias
            https://issues.apache.org/jira/browse/PIG-5115
PIG-5106    Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
            https://issues.apache.org/jira/browse/PIG-5106
PIG-5081    Can not run pig on spark source code distribution
            https://issues.apache.org/jira/browse/PIG-5081
PIG-5080    Support store alias as spark table
            https://issues.apache.org/jira/browse/PIG-5080
PIG-5057    IndexOutOfBoundsException when pig reducer processOnePackageOutput
            https://issues.apache.org/jira/browse/PIG-5057
PIG-5029    Optimize sort case when data is skewed
            https://issues.apache.org/jira/browse/PIG-5029
PIG-4926    Modify the content of start.xml for spark mode
            https://issues.apache.org/jira/browse/PIG-4926
PIG-4913    Reduce jython function initiation during compilation
            https://issues.apache.org/jira/browse/PIG-4913
PIG-4849    pig on tez will cause tez-ui to crash,because the content from timeline server is too long.
            https://issues.apache.org/jira/browse/PIG-4849
PIG-4750    REPLACE_MULTI should compile Pattern once and reuse it
            https://issues.apache.org/jira/browse/PIG-4750
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4373    Implement PIG-3861 in Tez
            https://issues.apache.org/jira/browse/PIG-4373
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587
PIG-1804    Alow Jython function to implement Algebraic a