Tajul Bashar created HIVE-13848:
-----------------------------------
Summary: Hive SORT/ORDER BY regex_extract(expression) alias column
does not work
Key: HIVE-13848
URL: https://issues.apache.org/jira/browse/HIVE-13848
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.0.0
Environment: Fedora Linux
Reporter: Tajul Bashar
Example column values:
-----------------------
<b>$29</b> per month. In addition you must keep paying your Medicare Part B
premium.
Additional <b>$30.90</b> per month. You must keep paying your Medicare Part B
premium and your <b>$29</b> monthly plan premium.
<b>$59</b> per month. In addition you must keep paying your Medicare Part B
premium.
<b>$29</b> per month. In addition you must keep paying your Medicare Part B
premium.
-------------------------------
Query without SORT or ORDER BY:
hive> select CAST(regexp_extract(benefit, '\$?(\\d+)', 1) AS FLOAT) as premium
from planservices where benefit like '%premium%' and benefit like '%<b>%</b>%'
limit 10;
OK
0.0
15.0
0.0
15.0
0.0
18.0
15.0
0.0
15.0
19.0
Time taken: 0.153 seconds, Fetched: 10 row(s)
-----------------------------
Query with SORT or ORDER BY:
select CAST(regexp_extract(benefit, '\$?(\\d+)', 1) AS FLOAT) as premium from
planservices where benefit like '%premium%' and benefit like '%<b>%</b>%' SORT
BY premium limit 10;
OK
NULL
NULL
NULL
NULL
NULL
NULL
NULL
0.0
0.0
0.0
Time taken: 4.106 seconds, Fetched: 10 row(s)
------
The result is the same whether the reducer count is set to 1 or more, and
whether the query uses SORT BY or ORDER BY (running on Hive-on-MR).
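
One way to narrow this down might be to compute the extracted value in a
subquery and sort in the outer query, so that the sort key is a plain column
rather than an alias of the regexp_extract expression. This is only a
diagnostic sketch: the subquery alias t is hypothetical, the rewrite is not
part of the original report, and it has not been verified to avoid the NULLs.

```sql
-- Diagnostic sketch (assumption, not from the original report):
-- materialize the alias in a subquery, then sort on it outside.
SELECT t.premium
FROM (
  SELECT CAST(regexp_extract(benefit, '\$?(\\d+)', 1) AS FLOAT) AS premium
  FROM planservices
  WHERE benefit LIKE '%premium%'
    AND benefit LIKE '%<b>%</b>%'
) t                      -- "t" is a hypothetical alias for the subquery
ORDER BY t.premium
LIMIT 10;
```

If this form returns non-NULL values while the original query does not, that
would suggest the problem lies in how the alias is resolved for SORT BY /
ORDER BY rather than in regexp_extract or the CAST itself.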