[jira] [Updated] (HIVE-21119) String UDAF and count distinct in the same select give error

Ravi Shetye (JIRA) Tue, 15 Jan 2019 09:53:52 -0800


     [ https://issues.apache.org/jira/browse/HIVE-21119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ravi Shetye updated HIVE-21119:
-------------------------------
    Labels: plannin  (was: wrongresults)

> String UDAF and count distinct in the same select give error
> ------------------------------------------------------------
>
>                 Key: HIVE-21119
>                 URL: https://issues.apache.org/jira/browse/HIVE-21119
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ravi Shetye
>            Priority: Major
>              Labels: plannin
>         Attachments: MaxUDA.java, run.log
>
>
> With the attached UDAF, the following query crashes on Hive.
> CRASHES
> {noformat}
> select rs_max(genderkey),count(distinct genderkey) from as_adventure.dimgender;
> {noformat}
> WORKS
> {noformat}
> select rs_max(genderkey) from as_adventure.dimgender;
> {noformat}
> The table looks like this:
> {noformat}
> 0: jdbc:hive2://localhost:10000> select * from dimgender;
> OK
> INFO  : Compiling command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7): select * from dimgender
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dimgender.genderkey, type:string, comment:null), FieldSchema(name:dimgender.gendername, type:string, comment:null)], properties:null)
> INFO  : Completed compiling command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7); Time taken: 0.2 seconds
> INFO  : Concurrency mode is disabled, not creating a lock manager
> INFO  : Executing command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7): select * from dimgender
> INFO  : Completed executing command(queryId=hive_20190111225125_486e6e6b-97fa-4dda-9688-a733180bcfe7); Time taken: 0.004 seconds
> INFO  : OK
> INFO  : Concurrency mode is disabled, not creating a lock manager
> +----------------------+-----------------------+
> | dimgender.genderkey  | dimgender.gendername  |
> +----------------------+-----------------------+
> | M                    | Male                  |
> | F                    | Female                |
> | U                    | Unisex                |
> +----------------------+-----------------------+
> {noformat}
> ERROR
> {noformat}
> Vertex failed, vertexName=Reducer 2, vertexId=vertex_1547169244949_0024_2_01, diagnostics=[Task failed, taskId=task_1547169244949_0024_2_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1547169244949_0024_2_01_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"_col0":"F"},"value":{"_col0":"F"}}
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>       at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>       at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>       at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>       at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>       at java.security.AccessController.doPrivileged(Native Method)
> {noformat}
> ...
> {noformat}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public boolean com.sample.MaxUDA$Evaluator.merge(java.lang.String) with arguments {F}:argument type mismatch
>       at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:1111)
>       at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.merge(GenericUDAFBridge.java:176)
>       at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:216)
> {noformat}
> PLAN
> {noformat}
> +----------------------------------------------------+
> |                      Explain                       |
> +----------------------------------------------------+
> | Plan optimized by CBO.                             |
> |                                                    |
> | Vertex dependency in root stage                    |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)                   |
> | Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)        |
> |                                                    |
> | Stage-0                                            |
> |   Fetch Operator                                   |
> |     limit:-1                                       |
> |     Stage-1                                        |
> |       Reducer 3                                    |
> |       File Output Operator [FS_6]                  |
> |         Group By Operator [GBY_12] (rows=1 width=368) |
> |           Output:["_col0","_col1"],aggregations:["rs_max(VALUE._col0)","count(VALUE._col1)"] |
> |         <-Reducer 2 [CUSTOM_SIMPLE_EDGE]           |
> |           PARTITION_ONLY_SHUFFLE [RS_11]           |
> |             Group By Operator [GBY_10] (rows=1 width=368) |
> |               Output:["_col0","_col1"],aggregations:["rs_max(_col1)","count(_col0)"] |
> |               Group By Operator [GBY_9] (rows=3 width=2) |
> |                 Output:["_col0","_col1"],aggregations:["rs_max(VALUE._col0)"],keys:KEY._col0 |
> |               <-Map 1 [SIMPLE_EDGE]                |
> |                 SHUFFLE [RS_8]                     |
> |                   PartitionCols:_col0              |
> |                   Group By Operator [GBY_7] (rows=3 width=2) |
> |                     Output:["_col0","_col1"],aggregations:["rs_max(genderkey)"],keys:genderkey |
> |                     Select Operator [SEL_1] (rows=3 width=2) |
> |                       Output:["genderkey"]         |
> |                       TableScan [TS_0] (rows=3 width=2) |
> |                         as_adventure@dimgender,dimgender,Tbl:COMPLETE,Col:NONE,Output:["genderkey"] |
> |                                                    |
> +----------------------------------------------------+
> 30 rows selected (0.3 seconds)
> 0: jdbc:hive2://localhost:10000> 
> {noformat}
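
A note on the "argument type mismatch" text in the trace above: this is the kind of failure java.lang.reflect.Method.invoke reports when the object passed is not an instance of the declared parameter type, even if it prints like one. The sketch below reproduces that failure mode in isolation. It is not the attached MaxUDA.java and says nothing certain about what Hive actually passes to merge; the Evaluator body and the Wrapped class are hypothetical stand-ins, and only the merge(java.lang.String) signature is taken from the stack trace.

```java
import java.lang.reflect.Method;

public class MergeMismatchSketch {

    // Hypothetical stand-in for the attached evaluator. Only the
    // merge(String) signature is taken from the stack trace.
    public static class Evaluator {
        private String max;

        public boolean merge(String other) {
            if (other != null && (max == null || other.compareTo(max) > 0)) {
                max = other;
            }
            return true;
        }
    }

    // Hypothetical value object that prints like a string but is not a
    // java.lang.String. It stands in for any non-String argument.
    public static class Wrapped {
        private final String value;

        public Wrapped(String value) {
            this.value = value;
        }

        @Override
        public String toString() {
            return value;
        }
    }

    // Returns true when the reflective call to merge(String) is rejected
    // with IllegalArgumentException, false when the call goes through.
    public static boolean rejects(Object arg) {
        try {
            Method merge = Evaluator.class.getMethod("merge", String.class);
            merge.invoke(new Evaluator(), arg);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown by Method.invoke when arg is not assignable to String.
            return true;
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("String argument rejected:  " + rejects("F"));
        System.out.println("Wrapped argument rejected: " + rejects(new Wrapped("F")));
    }
}
```

Both arguments would print as F in a log line, yet only the plain String is accepted. If something similar happens here, the value reaching merge in Reducer 2 may not be a java.lang.String, but that is an assumption to check against the attached UDAF and run.log.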



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
