[jira] [Commented] (HIVE-7817) distinct/group by don't work on partition columns

Pengcheng Xiong (JIRA) Thu, 04 Dec 2014 17:17:40 -0800

    [ https://issues.apache.org/jira/browse/HIVE-7817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234880#comment-14234880 ]


Pengcheng Xiong commented on HIVE-7817:
---------------------------------------

By the way, I do not think HIVE-3108 was ever resolved.

> distinct/group by don't work on partition columns
> -------------------------------------------------
>
>                 Key: HIVE-7817
>                 URL: https://issues.apache.org/jira/browse/HIVE-7817
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>            Reporter: Eugene Koifman
>
> Suppose you have a table like this:
> {code:sql}
> CREATE TABLE page_view(
>         viewTime INT,
>         userid BIGINT,
>         page_url STRING,
>         referrer_url STRING,
>         ip STRING COMMENT 'IP Address of the User')
> COMMENT 'This is the page view table'
> PARTITIONED BY(dt STRING, country STRING)
> CLUSTERED BY(userid) INTO 4 BUCKETS;
> {code}
> Then 
> {code:sql}
> select distinct dt from page_view;
> select distinct dt, country from page_view;
> select dt, country from page_view group by dt, country;
> {code}
> all fail with
> {noformat}
> Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Job running in-process (local Hadoop)
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2014-08-20 17:26:13,018 Stage-1 map = 0%,  reduce = 0%
> Ended Job = job_local165359429_0013 with errors
> Error during job, obtaining debugging information...
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
> {noformat}
> but 
> {code:sql}
> select dt, country, count(*) from page_view group by dt, country;
> {code}
> works fine.
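> A possible workaround, sketched from the observation above that adding an aggregate makes the query succeed. This is not verified against 0.14.0; the alias names t and cnt are illustrative, and the optimizer might still reduce the first query to the failing plan:
> {code:sql}
> -- Keep the count(*) that is reported to work, then project only the partition columns.
> select t.dt, t.country
> from (
>   select dt, country, count(*) as cnt
>   from page_view
>   group by dt, country
> ) t;
> -- Alternatively, list the partition values from the metastore without launching a job.
> show partitions page_view;
> {code}
> Note that show partitions returns partition specs of the form dt=.../country=... rather than separate columns.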



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
