[ https://issues.apache.org/jira/browse/PIG-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005428#comment-14005428 ]
Daniel Dai commented on PIG-3938:
---------------------------------

Basically, Pig doesn't know how to cast bytes into concrete objects. In Pig's view, only the producer of the bytes knows how to cast them. LoadStoreCaster was introduced for LoadFunc, but there is no equivalent for bytes that come from UDF output. We might need to add a hook to EvalFunc to define a caster.

> Type cast doesn't work after flatten result of UDF
> --------------------------------------------------
>
>                 Key: PIG-3938
>                 URL: https://issues.apache.org/jira/browse/PIG-3938
>             Project: Pig
>          Issue Type: Bug
>          Components: internal-udfs
>    Affects Versions: 0.12.0, 0.11.1
>            Reporter: Hongchang Li
>
> This ticket is very close to
> http://stackoverflow.com/questions/8828839/how-can-correct-data-types-on-apache-pig-be-enforced.
> To reproduce the issue, first we have a UDF that casts a map to a bag; the code is almost like
> http://stackoverflow.com/questions/12476929/group-key-value-of-map-in-pig?answertab=votes#tab-top
> {code:title=test.pig}
> $ cat test.pig
> register polisan/maptobag.jar;
> define MAPTOBAG maptobag.MAPTOBAG();
> A = load 'polisan/input1.txt' using PigStorage(' ') as (id:chararray, kv:[]);
> B = foreach A generate id, MAPTOBAG(kv) as to_bag;
> C = foreach B generate id, flatten(to_bag) as (key:chararray, value:chararray);
> D = group C by (id, key);
> E = foreach D generate group, MIN(C.value);
> dump E;
> {code}
> {code:title=polisan/input1.pig}
> 1 [x#1,y#ab]
> 1 [x#2,y#cd]
> {code}
> Then I ran the script and got the following exception:
> {noformat}
> 2014-05-15 19:44:52,944 [Thread-2] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: D: Local Rearrange[tuple]{tuple}(false) - scope-42 Operator Key: scope-42): org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:1)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2106: Error while computing min in Initial
>     at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:81)
>     at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:1)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:352)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:391)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
>     ... 8 more
> Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.String
>     at org.apache.pig.builtin.StringMin$Initial.exec(StringMin.java:73)
>     ... 15 more
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
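
To make the comment's point concrete, here is a rough, self-contained Java sketch of what a producer-supplied caster might look like. It deliberately does not use Pig's classes: RawBytes, BytesCaster, UTF8_CASTER, blindCast, castWith and min are all hypothetical names made up for illustration, with RawBytes standing in for DataByteArray and blindCast for the unchecked (String) cast that fails in StringMin in the trace. It is not the proposed EvalFunc hook, only an illustration of the idea that the producer of the bytes is the one who knows how to decode them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Every name in this file is a hypothetical stand-in; none of it is Pig's API.
class UdfCasterSketch {

    /** Stand-in for an untyped byte wrapper (the role DataByteArray plays in the trace). */
    static final class RawBytes {
        final byte[] data;
        RawBytes(String s) { this.data = s.getBytes(StandardCharsets.UTF_8); }
    }

    /** Hypothetical hook: the producer of the bytes declares how to decode them. */
    interface BytesCaster {
        String bytesToCharArray(byte[] bytes);
    }

    /** A caster a UDF author might supply when the UDF emits UTF-8 text. */
    static final BytesCaster UTF8_CASTER =
            bytes -> new String(bytes, StandardCharsets.UTF_8);

    /** Consumer that trusts the declared type; fails when handed RawBytes. */
    static String blindCast(Object value) {
        return (String) value; // ClassCastException if value is a RawBytes
    }

    /** Consumer that decodes raw bytes through the producer's caster first. */
    static String castWith(BytesCaster caster, Object value) {
        if (value instanceof RawBytes) {
            return caster.bytesToCharArray(((RawBytes) value).data);
        }
        return (String) value;
    }

    /** Lexicographic minimum over values, decoding each with the caster. */
    static String min(BytesCaster caster, List<Object> values) {
        String best = null;
        for (Object v : values) {
            String s = castWith(caster, v);
            if (best == null || s.compareTo(best) < 0) {
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Object> values = new ArrayList<>();
        values.add(new RawBytes("cd"));
        values.add(new RawBytes("ab"));
        try {
            blindCast(values.get(0));
        } catch (ClassCastException e) {
            System.out.println("blind cast failed: " + e.getClass().getSimpleName());
        }
        System.out.println("min with caster: " + min(UTF8_CASTER, values));
    }
}
```

The only design point here is where the decoding knowledge lives: the consumer (min) never guesses an encoding, it asks the caster that came with the data. How such a caster would actually be attached to an EvalFunc is left open, as in the comment.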