I've opened HIVE-10160 <https://issues.apache.org/jira/browse/HIVE-10160>:
 Give a warning when grouping or ordering by a constant column.

Thanks Gopal.

-- Lefty

On Tue, Mar 31, 2015 at 2:14 AM, Lefty Leverenz <leftylever...@gmail.com>
wrote:

> ---------- Forwarded message ----------
> From: Lefty Leverenz <leftylever...@gmail.com>
> Date: Tue, Mar 31, 2015 at 1:47 AM
> Subject: Re: ORDER BY clause in Hive
> To: Mich Talebzadeh <m...@peridale.co.uk>
>
>
> Hive as I see it does not support ORDER BY *Column position*. It only
>> supports ORDER BY *Column name*.
>>
>
> That's just in Hive release 0.10.0 and earlier.  In release 0.11.0+ you
> can set the configuration parameter
> <https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-ConfiguringHive>
>  *hive.groupby.orderby.position.**alias* to true, and then you can use
> column positions in ORDER BY.
>
> Here's the new documentation:
>
>    - hive.groupby.orderby.position.alias
>    
> <https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.groupby.orderby.position.alias>
>    - Order By
>    
> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy#LanguageManualSortBy-SyntaxofOrderBy>
>
>    - Group By
>    
> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy#LanguageManualGroupBy-GroupBySyntax>
>
>
> -- Lefty
>
> On Mon, Mar 30, 2015 at 4:46 AM, Mich Talebzadeh <m...@peridale.co.uk>
> wrote:
>
>> Gents,
>>
>>
>>
>> Hive as I see it does not support ORDER BY *Column position*. It only
>> supports ORDER BY *Column name*.
>>
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>> Mich Talebzadeh
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> *Publications due shortly:*
>>
>> *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen
>> and Coherence Cache*
>>
>>
>>
>> NOTE: The information in this email is proprietary and confidential. This
>> message is for the designated recipient only, if you are not the intended
>> recipient, you should destroy it immediately. Any information in this
>> message shall not be understood as given or endorsed by Peridale Ltd, its
>> subsidiaries or their employees, unless expressly so stated. It is the
>> responsibility of the recipient to ensure that this email is virus free,
>> therefore neither Peridale Ltd, its subsidiaries nor their employees accept
>> any responsibility.
>>
>>
>>
>> *From:* Gopal Vijayaraghavan [mailto:go...@hortonworks.com] *On Behalf
>> Of *Gopal Vijayaraghavan
>> *Sent:* 30 March 2015 05:26
>> *To:* user@hive.apache.org
>> *Cc:* Lefty Leverenz
>>
>> *Subject:* Re: ORDER BY clause in Hive
>>
>>
>>
>> Hi Lefty,
>>
>>
>>
>> Couldn’t find the documentation for what
>> hive.groupby.orderby.position.alias=true does.
>>
>>
>>
>> I suspect that might be what Mich was looking for (though I tend to write
>> the column names explicitly).
>>
>>
>>
>> Cheers,
>>
>> Gopal
>>
>>
>>
>> *From: *Lefty Leverenz <leftylever...@gmail.com>
>> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
>> *Date: *Sunday, March 29, 2015 at 8:32 PM
>> *To: *"user@hive.apache.org" <user@hive.apache.org>
>> *Subject: *Re: ORDER BY clause in Hive
>>
>>
>>
>> I added information about this in the Order By
>> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy#LanguageManualSortBy-SyntaxofOrderBy>
>> section of the wiki.  Thanks, Mich and Gopal!
>>
>>
>> -- Lefty
>>
>>
>>
>> On Sun, Mar 29, 2015 at 5:24 AM, Mich Talebzadeh <m...@peridale.co.uk>
>> wrote:
>>
>> Hi Lefty, Gopal,
>>
>>
>>
>> It appears that ORDER BY 3 is not interpreted as ORDERR BY “the result
>> set column three” which standard SQL as evident from Oracle and Sybase does.
>>
>>
>>
>> So I made it an ORDER BY from the result set EXPLICITELY as shown below
>> and it worked OK
>>
>>
>>
>> SELECT
>>
>>           rs.Customer_ID
>>
>>         , rs.Number_of_orders
>>
>>         , rs.Total_customer_amount
>>
>>         , rs.Average_order
>>
>>         , rs.Standard_deviation
>>
>> FROM
>>
>> (
>>
>>         SELECT cust_id AS Customer_ID,
>>
>>         COUNT(amount_sold) AS Number_of_orders,
>>
>>         SUM(amount_sold) AS Total_customer_amount,
>>
>>         AVG(amount_sold) AS Average_order,
>>
>>         STDDEV(amount_sold) AS Standard_deviation
>>
>>         FROM sales
>>
>>         GROUP BY cust_id
>>
>>         HAVING SUM(amount_sold) > 94000
>>
>>         AND AVG(amount_sold) < STDDEV(amount_sold)
>>
>> ) rs
>>
>> ORDER BY
>>
>>           *rs.Total_customer_amount*
>>
>> ;
>>
>>
>>
>>
>>
>>
>> +-----------------+----------------------+---------------------------+---------------------+------------------------+--+
>>
>> | rs.customer_id  | rs.number_of_orders  | rs.total_customer_amount  |
>> rs.average_order   | rs.standard_deviation  |
>>
>>
>> +-----------------+----------------------+---------------------------+---------------------+------------------------+--+
>>
>> | 1743.0          | 238                  | 94786.12999999993         |
>> 398.2610504201678   | 581.0439095219863      |
>>
>> | 429.0           | 231                  | 94819.41000000006         |
>> 410.4736363636366   | 613.7057080691426      |
>>
>> | 2994.0          | 227                  | 94862.61000000006         |
>> 417.89696035242315  | 623.1607772763742      |
>>
>> | 6395.0          | 268                  | 97010.47999999998         |
>> 361.97940298507456  | 576.9120977984521      |
>>
>> | 12783.0         | 240                  | 97573.54999999996         |
>> 406.5564583333332   | 590.4445500393804      |
>>
>> | 4974.0          | 235                  | 98006.16000000002         |
>> 417.0474893617022   | 624.337482834059       |
>>
>> | 42167.0         | 266                  | 98585.96000000002         |
>> 370.6239097744362   | 590.965120684093       |
>>
>> | 10747.0         | 256                  | 99578.08999999997         |
>> 388.9769140624999   | 600.7615005975689      |
>>
>> | 11407.0         | 248                  | 103412.65999999995        |
>> 416.9865322580643   | 622.221465710723       |
>>
>>
>> +-----------------+----------------------+---------------------------+---------------------+------------------------+--+
>>
>> 9 rows selected (209.699 seconds)
>>
>>
>>
>>
>>
>> Regards,
>>
>>
>>
>> Mich Talebzadeh
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> *Publications due shortly:*
>>
>> *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen
>> and Coherence Cache*
>>
>>
>>
>> NOTE: The information in this email is proprietary and confidential. This
>> message is for the designated recipient only, if you are not the intended
>> recipient, you should destroy it immediately. Any information in this
>> message shall not be understood as given or endorsed by Peridale Ltd, its
>> subsidiaries or their employees, unless expressly so stated. It is the
>> responsibility of the recipient to ensure that this email is virus free,
>> therefore neither Peridale Ltd, its subsidiaries nor their employees accept
>> any responsibility.
>>
>>
>>
>> *From:* Mich Talebzadeh [mailto:m...@peridale.co.uk]
>> *Sent:* 29 March 2015 00:11
>> *To:* user@hive.apache.org
>> *Subject:* ORDER BY clause in Hive
>>
>>
>>
>> Hi,
>>
>>
>>
>> Can someone point me to doc or otherwise to see if ORDER BY clause in
>> Hive is working OK
>>
>>
>>
>> I have a simple aggregate query as follows:
>>
>>
>>
>> SELECT cust_id AS Customer_ID,
>>
>> COUNT(amount_sold) AS Number_of_orders,
>>
>> SUM(amount_sold) AS Total_customer_amount,
>>
>> AVG(amount_sold) AS Average_order,
>>
>> STDDEV(amount_sold) AS Standard_deviation
>>
>> FROM sales
>>
>> GROUP BY cust_id
>>
>> HAVING SUM(amount_sold) > 94000
>>
>> AND AVG(amount_sold) < STDDEV(amount_sold)
>>
>> ORDER BY 3 ;
>>
>>
>>
>> The original table and data are from Oracle sh.sales table
>>
>>
>>
>> Oracle comes back for this query with
>>
>>
>>
>> CUSTOMER_ID NUMBER_OF_ORDERS TOTAL_CUSTOMER_AMOUNT AVERAGE_ORDER
>> STANDARD_DEVIATION
>>
>> ----------- ---------------- --------------------- -------------
>> ------------------
>>
>>        1743              238              94786.13     398.26105
>> 582.26845
>>
>>         429              231              94819.41    410.473636
>> 615.038404
>>
>>        2994              227              94862.61     417.89696
>> 624.53793
>>
>>        6395              268              97010.48    361.979403
>> 577.991448
>>
>>       12783              240              97573.55
>> 406.556458           591.6785
>>
>>        4974              235              98006.16    417.047489
>> 625.670115
>>
>>       42167              266              98585.96     370.62391
>> 592.079099
>>
>>       10747              256              99578.09    388.976914
>> 601.938312
>>
>>       11407              248             103412.66    416.986532
>> 623.479751
>>
>>
>>
>> 9 rows selected.
>>
>>
>>
>> Ordered by TOTAL_CUSTOMER_AMOUNT
>>
>>
>>
>> And hive returns for the same query
>>
>>
>>
>>
>> +--------------+-------------------+------------------------+---------------------+---------------------+--+
>>
>> | customer_id  | number_of_orders  | total_customer_amount  |
>> average_order    | standard_deviation  |
>>
>>
>> +--------------+-------------------+------------------------+---------------------+---------------------+--+
>>
>> | 42167.0      | 266               | 98585.96000000002      |
>> 370.6239097744362   | 590.965120684093    |
>>
>> | 12783.0      | 240               | 97573.54999999996      |
>> 406.5564583333332   | 590.4445500393804   |
>>
>> | 11407.0      | 248               | 103412.65999999995     |
>> 416.9865322580643   | 622.221465710723    |
>>
>> | 10747.0      | 256               | 99578.08999999997      |
>> 388.9769140624999   | 600.7615005975689   |
>>
>> | 6395.0       | 268               | 97010.47999999998      |
>> 361.97940298507456  | 576.9120977984521   |
>>
>> | 4974.0       | 235               | 98006.16000000002      |
>> 417.0474893617022   | 624.337482834059    |
>>
>> | 2994.0       | 227               | 94862.61000000006      |
>> 417.89696035242315  | 623.1607772763742   |
>>
>> | 1743.0       | 238               | 94786.12999999993      |
>> 398.2610504201678   | 581.0439095219863   |
>>
>> | 429.0        | 231               | 94819.41000000006      |
>> 410.4736363636366   | 613.7057080691426   |
>>
>>
>> +--------------+-------------------+------------------------+---------------------+---------------------+--+
>>
>> 9 rows selected (215.774 seconds)
>>
>>
>>
>> But ordering in Hive does not seem to be correct! Please note ordering is
>> on column three, *total_customer_amount*
>>
>>
>>
>> I also tried this in Sybase and got the same as Oracle.
>>
>>
>>
>> Adaptive Server cpu time: 100 ms.
>>
>> Customer_ID          Number_of_orders
>> Total_customer_amount
>> Average_order                                        Standard_deviation
>>
>> -------------------- ----------------
>> -----------------------------------------
>> ----------------------------------------------------
>> ---------------------------
>>
>>                 1743              238
>> 94786.13
>> 398.2610504201680                  582.268450
>>
>>                   429              231
>> 94819.41
>>                      410.4736363636363                  615.038404
>>
>>                  2994              227
>> 94862.61
>> 417.8969603524229                  624.537930
>>
>>                  6395              268
>> 97010.48
>> 361.9794029850746                  577.991448
>>
>>                 12783              240
>> 97573.55
>> 406.5564583333333                  591.678500
>>
>>                  4974              235
>> 98006.16
>> 417.0474893617021                  625.670115
>>
>>                 42167              266
>>                    98585.96
>> 370.6239097744360                  592.079099
>>
>>                 10747              256
>> 99578.09                                    388.9769140625000
>>     601.938312
>>
>>                 11407              248
>> 103412.66
>> 416.9865322580645                  623.479751
>>
>>
>>
>> I tried Google search and seems to be different suggestions. May be I
>> have to rewrite the code?
>>
>>
>>
>> Thanks
>>
>>
>>
>> Mich Talebzadeh
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> *Publications due shortly:*
>>
>> *Creating in-memory Data Grid for Trading Systems with Oracle TimesTen
>> and Coherence Cache*
>>
>>
>>
>> NOTE: The information in this email is proprietary and confidential. This
>> message is for the designated recipient only, if you are not the intended
>> recipient, you should destroy it immediately. Any information in this
>> message shall not be understood as given or endorsed by Peridale Ltd, its
>> subsidiaries or their employees, unless expressly so stated. It is the
>> responsibility of the recipient to ensure that this email is virus free,
>> therefore neither Peridale Ltd, its subsidiaries nor their employees accept
>> any responsibility.
>>
>>
>>
>>
>>
>
>
>

Reply via email to