[jira] [Updated] (HIVE-3762) Minor fix for 'tableName' in Hive.g

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3762:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thanks Navis

> Minor fix for 'tableName' in Hive.g
> ---
>
> Key: HIVE-3762
> URL: https://issues.apache.org/jira/browse/HIVE-3762
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3762.D7143.1.patch
>
>
> The current definition of 'tableName' is "(db=Identifier DOT)? tab=Identifier". 
> If the user specifies "default." for it, the Hive parser accepts "default" as 
> the table name and leaves "." for the next token, which is not valid.
> This is trivial, but it is a small, necessary part of improving query 
> auto-completion (which I am working on).
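
The behaviour described in the issue can be reproduced with a toy recognizer. This is only a minimal sketch, assuming a simplified tokenizer that knows identifiers and DOT; `tokenize` and `parse_table_name` are hypothetical names and are not part of Hive.g or ANTLR.

```python
import re

# Hypothetical toy tokenizer: identifiers and the DOT token only.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|\.)")


def tokenize(text):
    """Split text into identifier and "." tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_identifier(token):
    return token != "."


def parse_table_name(tokens):
    """Mimic the rule "(db=Identifier DOT)? tab=Identifier".

    Returns (db, tab, leftover_tokens). The optional prefix is taken only
    when a full "Identifier DOT Identifier" sequence is available, so a
    trailing DOT is left over for whatever rule comes next.
    """
    if (len(tokens) >= 3 and is_identifier(tokens[0])
            and tokens[1] == "." and is_identifier(tokens[2])):
        return tokens[0], tokens[2], tokens[3:]
    if tokens and is_identifier(tokens[0]):
        return None, tokens[0], tokens[1:]
    raise ValueError("expected an identifier")


print(parse_table_name(tokenize("db1.t1")))    # qualified name
print(parse_table_name(tokenize("default.")))  # dangling DOT is left over
```

For the input "default." the sketch reports "default" as the table name and hands the dangling "." back to the caller, which is the situation the issue describes as invalid.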

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3774) Sort merge join should work if join cols are a prefix of sort columns for each partition

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511234#comment-13511234
 ] 

Namit Jain commented on HIVE-3774:
--

https://reviews.facebook.net/D7179

> Sort merge join should work if join cols are a prefix of sort columns for 
> each partition
> 
>
> Key: HIVE-3774
> URL: https://issues.apache.org/jira/browse/HIVE-3774
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3774.1.patch
>
>
> Currently, a join is converted into a sort-merge join only if the join cols 
> exactly match the sort cols.
> This constraint can definitely be relaxed.
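
The relaxed condition from the issue title can be sketched as a simple prefix test. This is an illustration only, not the logic in the attached patch; the function names and the representation of sort columns as one list per partition are assumptions.

```python
def exact_match(join_cols, sort_cols):
    """Current constraint: join columns must equal the sort columns."""
    return list(join_cols) == list(sort_cols)


def is_prefix(join_cols, sort_cols):
    """Relaxed constraint: join columns are a leading prefix of the sort columns.

    Data sorted by (a, b) is also sorted by (a), so a merge on (a) stays valid.
    """
    join_cols, sort_cols = list(join_cols), list(sort_cols)
    return 0 < len(join_cols) <= len(sort_cols) and \
        sort_cols[:len(join_cols)] == join_cols


def can_use_sort_merge_join(join_cols, sort_cols_per_partition):
    """The prefix test has to hold for the sort columns of every partition."""
    return all(is_prefix(join_cols, cols) for cols in sort_cols_per_partition)


# Hypothetical table whose partitions are sorted by (key, value) or by (key).
partitions = [["key", "value"], ["key"]]
print(exact_match(["key"], ["key", "value"]))
print(can_use_sort_merge_join(["key"], partitions))
```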



[jira] [Updated] (HIVE-3774) Sort merge join should work if join cols are a prefix of sort columns for each partition

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3774:
-

Attachment: hive.3774.1.patch

> Sort merge join should work if join cols are a prefix of sort columns for 
> each partition
> 
>
> Key: HIVE-3774
> URL: https://issues.apache.org/jira/browse/HIVE-3774
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3774.1.patch
>
>
> Currently, a join is converted into a sort-merge join only if the join cols 
> exactly match the sort cols.
> This constraint can definitely be relaxed.



[jira] [Updated] (HIVE-3774) Sort merge join should work if join cols are a prefix of sort columns for each partition

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3774:
-

Status: Patch Available  (was: Open)

> Sort merge join should work if join cols are a prefix of sort columns for 
> each partition
> 
>
> Key: HIVE-3774
> URL: https://issues.apache.org/jira/browse/HIVE-3774
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3774.1.patch
>
>
> Currently, a join is converted into a sort-merge join only if the join cols 
> exactly match the sort cols.
> This constraint can definitely be relaxed.



[jira] [Created] (HIVE-3775) Unit test failures due to unspecified order of results in "show grant" command

2012-12-06 Thread Gunther Hagleitner (JIRA)
Gunther Hagleitner created HIVE-3775:


 Summary: Unit test failures due to unspecified order of results in 
"show grant" command
 Key: HIVE-3775
 URL: https://issues.apache.org/jira/browse/HIVE-3775
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner


A number of unit tests that use "show grant" sometimes fail when run on Windows 
or when previous failures have left the database in an unexpected state.

The reason is that the output of "show grant" is not specified to be in any 
particular order, but the golden files expect it to be.

The unit test framework should be extended to handle cases like that.



[jira] [Updated] (HIVE-3775) Unit test failures due to unspecified order of results in "show grant" command

2012-12-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-3775:
-

Status: Patch Available  (was: Open)

> Unit test failures due to unspecified order of results in "show grant" command
> --
>
> Key: HIVE-3775
> URL: https://issues.apache.org/jira/browse/HIVE-3775
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-3775.1-r1417768.patch
>
>
> A number of unit tests that use "show grant" sometimes fail when run on 
> Windows or when previous failures have left the database in an unexpected state.
> The reason is that the output of "show grant" is not specified to be in any 
> particular order, but the golden files expect it to be.
> The unit test framework should be extended to handle cases like that.



[jira] [Updated] (HIVE-3775) Unit test failures due to unspecified order of results in "show grant" command

2012-12-06 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-3775:
-

Attachment: HIVE-3775.1-r1417768.patch

The patch adds an option that lets you specify "-- SORT_BEFORE_DIFF" in query 
files. These files will be sorted before the diff is run. All auth test cases 
have been changed to add this comment.
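
The idea of sorting before the diff can be sketched as follows. This is not the code in the patch; it assumes the marker sits on a line of its own in the query file and that both the actual output and the golden output are sorted line by line. The helper names and the sample query are hypothetical.

```python
import difflib

SORT_MARKER = "-- SORT_BEFORE_DIFF"


def wants_sort(query_text):
    """True if the query file carries the marker comment on its own line."""
    return any(line.strip() == SORT_MARKER for line in query_text.splitlines())


def diff_output(query_text, actual, expected):
    """Diff actual against expected output, sorting both first if requested."""
    actual_lines = actual.splitlines()
    expected_lines = expected.splitlines()
    if wants_sort(query_text):
        # Row order is unspecified, so compare the outputs as sorted lines.
        actual_lines = sorted(actual_lines)
        expected_lines = sorted(expected_lines)
    return list(difflib.unified_diff(expected_lines, actual_lines,
                                     fromfile="expected", tofile="actual",
                                     lineterm=""))


query = "-- SORT_BEFORE_DIFF\nshow grant user u on table t;"
# Same rows in a different order: no differences are reported.
print(diff_output(query, "row_b\nrow_a", "row_a\nrow_b"))
```

Without the marker the same two outputs would produce a non-empty diff, which is the kind of spurious failure the issue describes.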

> Unit test failures due to unspecified order of results in "show grant" command
> --
>
> Key: HIVE-3775
> URL: https://issues.apache.org/jira/browse/HIVE-3775
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-3775.1-r1417768.patch
>
>
> A number of unit tests that use "show grant" sometimes fail when run on 
> Windows or when previous failures have left the database in an unexpected state.
> The reason is that the output of "show grant" is not specified to be in any 
> particular order, but the golden files expect it to be.
> The unit test framework should be extended to handle cases like that.



[jira] [Updated] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3140:
-

Status: Open  (was: Patch Available)

minor comments on phabricator

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>                      2 = comment2
>                      3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at column k on its first line, every following line of the 
> comment should also be indented by k.
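
The requested layout can be sketched as a small formatter. This is an illustration of the indentation rule only, not the CLI code or the attached patch; `format_column` and the single-space separator are assumptions.

```python
def format_column(name, col_type, comment, sep=" "):
    """Render one "describe" row, aligning continuation lines of the comment.

    Every line after the first is padded so that it starts in the same
    column as the first line of the comment.
    """
    prefix = f"{name}{sep}{col_type}{sep}"
    lines = comment.split("\n")
    pad = " " * len(prefix)
    return "\n".join([prefix + lines[0]] + [pad + line for line in lines[1:]])


print(format_column("birthdayParam", "string",
                    "1 = comment1\n2 = comment2\n3 = comment3"))
```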



[jira] [Created] (HIVE-3776) support PIVOT in hive

2012-12-06 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3776:


 Summary: support PIVOT in hive
 Key: HIVE-3776
 URL: https://issues.apache.org/jira/browse/HIVE-3776
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain


It is a fairly well understood feature in databases.



[jira] [Updated] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2477:


Status: Patch Available  (was: Open)

> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, HIVE-2477.D7161.1.patch, 
> HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



[jira] [Commented] (HIVE-2723) should throw "Ambiguous column reference key" Exception in particular join condition

2012-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511257#comment-13511257
 ] 

Hudson commented on HIVE-2723:
--

Integrated in Hive-trunk-h0.21 #1836 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1836/])
HIVE-2723 should throw "Ambiguous column reference key" Exception in 
particular
join condition (Navis via namit) (Revision 1417743)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1417743
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientnegative/ambiguous_col0.q
* /hive/trunk/ql/src/test/queries/clientnegative/ambiguous_col1.q
* /hive/trunk/ql/src/test/queries/clientnegative/ambiguous_col2.q
* /hive/trunk/ql/src/test/queries/clientpositive/ambiguous_col.q
* /hive/trunk/ql/src/test/results/clientnegative/ambiguous_col0.q.out
* /hive/trunk/ql/src/test/results/clientnegative/ambiguous_col1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/ambiguous_col2.q.out
* /hive/trunk/ql/src/test/results/clientnegative/ambiguous_col_patterned.q.out
* /hive/trunk/ql/src/test/results/clientpositive/ambiguous_col.q.out


> should throw  "Ambiguous column reference key"  Exception in particular join 
> condition
> --
>
> Key: HIVE-2723
> URL: https://issues.apache.org/jira/browse/HIVE-2723
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
> Environment: Linux zongren-VirtualBox 3.0.0-14-generic #23-Ubuntu SMP 
> Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
> java version "1.6.0_25"
> hadoop-0.20.2-cdh3u0
> hive-0.7.0-cdh3u0
>Reporter: caofangkun
>Assignee: Navis
>Priority: Minor
>  Labels: exception-handling, query, queryparser
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2723.D1275.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HIVE-2723.D1275.2.patch, hive.2723.1.patch, 
> HIVE-2723.D1275.3.patch
>
>
>  



[jira] [Commented] (HIVE-3594) When Group by Partition Column Type is Timestamp or STRING Which Format contains "HH:MM:SS", It will occur URISyntaxException

2012-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511258#comment-13511258
 ] 

Hudson commented on HIVE-3594:
--

Integrated in Hive-trunk-h0.21 #1836 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1836/])
HIVE-3594 When Group by Partition Column Type is Timestamp or STRING Which 
Format contains "HH:MM:SS",
It will occur URISyntaxException (Navis via namit) (Revision 1417741)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1417741
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java
* /hive/trunk/ql/src/test/queries/clientpositive/metadataonly1.q
* /hive/trunk/ql/src/test/results/clientpositive/metadataonly1.q.out


> When Group by Partition Column Type is Timestamp or STRING Which Format 
> contains "HH:MM:SS", It will occur URISyntaxException
> -
>
> Key: HIVE-3594
> URL: https://issues.apache.org/jira/browse/HIVE-3594
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Daisy.Yuan
>Assignee: Navis
> Attachments: HIVE-3594.D6081.1.patch, HIVE-3594.D6081.2.patch
>
>
> create table test (no int, name string) partitioned by (pts string) row 
> format delimited fields terminated by ' '; 
> load data local inpath '/opt/files/groupbyts1.txt' into table test 
> partition(pts='12:11:30');
> load data local inpath '/opt/files/groupbyts2.txt' into table test 
> partition(pts='21:25:12');
> load data local inpath '/opt/files/groupbyts3.txt' into table test 
> partition(pts='12:11:30');
> load data local inpath '/opt/files/groupbyts4.txt' into table test 
> partition(pts='21:25:12');
> When I execute “select * from test group by pts;”, the following exception 
> occurs:
> at org.apache.hadoop.fs.Path.initialize(Path.java:157)
> at org.apache.hadoop.fs.Path.<init>(Path.java:135)
> at org.apache.hadoop.hive.ql.exec.Utilities.getInputSummary(Utilities.java:1667)
> at org.apache.hadoop.hive.ql.exec.MapRedTask.estimateNumberOfReducers(MapRedTask.java:432)
> at org.apache.hadoop.hive.ql.exec.MapRedTask.setNumberOfReducers(MapRedTask.java:400)
> at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:93)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:135)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1329)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1121)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:954)
> at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
> at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
> at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: fake-path-metadata-only-query-default.test{pts=12:11:30%7D
> at java.net.URI.checkPath(URI.java:1788)
> at java.net.URI.<init>(URI.java:734)
> at org.apache.hadoop.fs.Path.initialize(Path.java:154)
> ... 19 more
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
> When PhysicalOptimizer optimizes GroupByOperator, MetadataOnlyOptimizer is 
> enabled because the parameter "hive.optimize.metadataonly" defaults to true. 
> The MetadataOnlyOptimizer changes the partition alias desc. The partition 
> alias "hdfs://ip:9000/user/hive/warehouse/test/pts=12%3A11%3A30" is changed 
> into "fake-path-metadata-only-query-default.test{pts=12:11:30}". Constructing 
> a URI from the new partition alias then throws java.net.URISyntaxException. 
>  
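
Why the rewritten alias fails can be sketched with a simplified model. The code below is an assumption-laden approximation of two behaviours: a Hadoop-style path string is split at the first ":" that appears before any "/", and a URI with a scheme rejects a path that does not start with "/". It is not the actual Hadoop or JDK code, and `split_scheme` and `check_path` are hypothetical names.

```python
def split_scheme(path_string):
    """Treat everything before the first ':' as a scheme, unless a '/'
    comes first (a simplified model of how a path string is parsed)."""
    colon = path_string.find(":")
    slash = path_string.find("/")
    if colon != -1 and (slash == -1 or colon < slash):
        return path_string[:colon], path_string[colon + 1:]
    return None, path_string


def check_path(path_string):
    """Return (scheme, path), or raise if a scheme is followed by a relative path."""
    scheme, rest = split_scheme(path_string)
    if rest.startswith("//"):
        # Skip the authority part, e.g. "//ip:9000".
        next_slash = rest.find("/", 2)
        rest = rest[next_slash:] if next_slash != -1 else ""
    if scheme is not None and rest and not rest.startswith("/"):
        raise ValueError(f"Relative path in absolute URI: {path_string}")
    return scheme, rest


# The escaped HDFS location parses cleanly.
print(check_path("hdfs://ip:9000/user/hive/warehouse/test/pts=12%3A11%3A30"))

# The fake alias contains raw ':' characters, so the text before the first
# ':' is taken as the scheme and "11:30}" is left as a relative path.
try:
    check_path("fake-path-metadata-only-query-default.test{pts=12:11:30}")
except ValueError as err:
    print(err)
```

In this model the problem is the unescaped ":" in the partition value; the original HDFS location avoids it because the value is stored as "12%3A11%3A30".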



[jira] [Updated] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2477:
--

Attachment: HIVE-2477.D7161.2.patch

navis updated the revision "HIVE-2477 [jira] Use name of original expression 
for name of CAST output".
Reviewers: JIRA

  Addressed comments


REVISION DETAIL
  https://reviews.facebook.net/D7161

AFFECTED FILES
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/test/queries/clientpositive/alias_casted_column.q
  ql/src/test/results/clientpositive/alias_casted_column.q.out
  ql/src/test/results/compiler/plan/input_testxpath2.q.xml

To: JIRA, navis
Cc: njain


> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, HIVE-2477.D7161.1.patch, 
> HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



[jira] [Updated] (HIVE-2983) Hive ant targets for publishing maven artifacts can be simplified

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2983:
--

Attachment: hive.2991.1.trunk.patch

> Hive ant targets for publishing maven artifacts can be simplified
> -
>
> Key: HIVE-2983
> URL: https://issues.apache.org/jira/browse/HIVE-2983
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2983.D2961.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Hive has a few ant tasks related to publishing maven artifacts. As not all 
> sub projects publish artifacts the {{iterate}} macro that simplifies other 
> tasks cannot be used in this context.
> Hive already uses the {{for}} task from ant-contrib, which works great here. 
> {{build.xml}} can be simplified by using the for task when preparing maven 
> artifacts.



[jira] [Updated] (HIVE-2983) Hive ant targets for publishing maven artifacts can be simplified

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2983:
--

Attachment: hive.2991.1.branch-0.10.patch
hive.2991.1.branch-0.9.patch

> Hive ant targets for publishing maven artifacts can be simplified
> -
>
> Key: HIVE-2983
> URL: https://issues.apache.org/jira/browse/HIVE-2983
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2983.D2961.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Hive has a few ant tasks related to publishing maven artifacts. As not all 
> sub projects publish artifacts the {{iterate}} macro that simplifies other 
> tasks cannot be used in this context.
> Hive already uses the {{for}} task from ant-contrib, which works great here. 
> {{build.xml}} can be simplified by using the for task when preparing maven 
> artifacts.



[jira] [Updated] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2477:
-

Attachment: hive.2477.4.patch

> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, hive.2477.4.patch, 
> HIVE-2477.D7161.1.patch, HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Attachment: hive.2991.1.trunk.patch

> Integrate Clover with Hive
> --
>
> Key: HIVE-2991
> URL: https://issues.apache.org/jira/browse/HIVE-2991
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Atlassian has donated a license for their code coverage tool Clover to the ASF. 
> Let's make use of it to generate code coverage reports to figure out which 
> areas of Hive are well tested and which ones are not. More information about 
> the license can be found in Hadoop JIRA HADOOP-1718.



[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Attachment: hive.2991.1.branch-0.10.patch
hive.2991.1.branch-0.9.patch

> Integrate Clover with Hive
> --
>
> Key: HIVE-2991
> URL: https://issues.apache.org/jira/browse/HIVE-2991
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Atlassian has donated a license for their code coverage tool Clover to the ASF. 
> Let's make use of it to generate code coverage reports to figure out which 
> areas of Hive are well tested and which ones are not. More information about 
> the license can be found in Hadoop JIRA HADOOP-1718.



[jira] [Updated] (HIVE-2983) Hive ant targets for publishing maven artifacts can be simplified

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2983:
--

Attachment: (was: hive.2991.1.branch-0.10.patch)

> Hive ant targets for publishing maven artifacts can be simplified
> -
>
> Key: HIVE-2983
> URL: https://issues.apache.org/jira/browse/HIVE-2983
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2983.D2961.1.patch
>
>
> Hive has a few ant tasks related to publishing maven artifacts. As not all 
> sub projects publish artifacts the {{iterate}} macro that simplifies other 
> tasks cannot be used in this context.
> Hive already uses the {{for}} task from ant-contrib, which works great here. 
> {{build.xml}} can be simplified by using the for task when preparing maven 
> artifacts.



[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Affects Version/s: 0.9.0
   Status: Patch Available  (was: Open)

> Integrate Clover with Hive
> --
>
> Key: HIVE-2991
> URL: https://issues.apache.org/jira/browse/HIVE-2991
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Affects Versions: 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Atlassian has donated a license for their code coverage tool Clover to the ASF. 
> Let's make use of it to generate code coverage reports to figure out which 
> areas of Hive are well tested and which ones are not. More information about 
> the license can be found in Hadoop JIRA HADOOP-1718.



[jira] [Updated] (HIVE-2983) Hive ant targets for publishing maven artifacts can be simplified

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2983:
--

Attachment: (was: hive.2991.1.trunk.patch)

> Hive ant targets for publishing maven artifacts can be simplified
> -
>
> Key: HIVE-2983
> URL: https://issues.apache.org/jira/browse/HIVE-2983
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2983.D2961.1.patch
>
>
> Hive has a few ant tasks related to publishing maven artifacts. As not all 
> sub projects publish artifacts the {{iterate}} macro that simplifies other 
> tasks cannot be used in this context.
> Hive already uses the {{for}} task from ant-contrib, which works great here. 
> {{build.xml}} can be simplified by using the for task when preparing maven 
> artifacts.



[jira] [Updated] (HIVE-2983) Hive ant targets for publishing maven artifacts can be simplified

2012-12-06 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2983:
--

Attachment: (was: hive.2991.1.branch-0.9.patch)

> Hive ant targets for publishing maven artifacts can be simplified
> -
>
> Key: HIVE-2983
> URL: https://issues.apache.org/jira/browse/HIVE-2983
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2983.D2961.1.patch
>
>
> Hive has a few ant tasks related to publishing maven artifacts. As not all 
> sub projects publish artifacts the {{iterate}} macro that simplifies other 
> tasks cannot be used in this context.
> Hive already uses the {{for}} task from ant-contrib, which works great here. 
> {{build.xml}} can be simplified by using the for task when preparing maven 
> artifacts.



[jira] [Commented] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511353#comment-13511353
 ] 

Namit Jain commented on HIVE-2477:
--

+1

> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, hive.2477.4.patch, 
> HIVE-2477.D7161.1.patch, HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3401:
-

Status: Open  (was: Patch Available)

It is still not applying cleanly

> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar, 
> something like TABLESAMPLE(500M). 
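
A size-based sample argument such as the 500M above has to be turned into a byte count at some point. The following is a minimal sketch of that conversion only, not the code from the attached patches. The class name SampleSizeSketch, the method parseBytes, the accepted unit letters (B, K, M, G) and the use of binary multiples are all assumptions made for illustration.

```java
// Hypothetical helper, not part of Hive: converts a size literal such as
// "500M" into a number of bytes.
class SampleSizeSketch {
    static long parseBytes(String literal) {
        String s = literal.trim();
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size literal");
        }
        // The last character is taken as the unit (assumed: B, K, M or G).
        char unit = Character.toUpperCase(s.charAt(s.length() - 1));
        long multiplier;
        switch (unit) {
            case 'B': multiplier = 1L; break;
            case 'K': multiplier = 1L << 10; break; // assumed binary multiples
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default:
                throw new IllegalArgumentException("unknown unit in: " + literal);
        }
        // Everything before the unit must be a non-negative integer.
        long count = Long.parseLong(s.substring(0, s.length() - 1));
        if (count < 0) {
            throw new IllegalArgumentException("negative size: " + literal);
        }
        // multiplyExact throws ArithmeticException instead of overflowing silently.
        return Math.multiplyExact(count, multiplier);
    }
}
```

Under these assumptions, SampleSizeSketch.parseBytes("500M") yields 524288000, and a literal without a unit letter is rejected rather than guessed at.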



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511402#comment-13511402
 ] 

Namit Jain commented on HIVE-3562:
--

I think it would be simpler to keep this a Hive-only change.
I agree that the Hadoop change is more general, but it may be 
painful/time-consuming to backport.
In most cases, k would be small, and it makes perfect sense to put a 
limit on k, as you suggested above.

What do you think?
Should we go with the Hive-only change?

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Commented] (HIVE-3762) Minor fix for 'tableName' in Hive.g

2012-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511416#comment-13511416
 ] 

Hudson commented on HIVE-3762:
--

Integrated in Hive-trunk-h0.21 #1837 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1837/])
HIVE-3762 Minor fix for 'tableName' in Hive.g
(Navis via namit) (Revision 1417763)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1417763
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
* /hive/trunk/ql/src/test/results/clientnegative/invalid_tbl_name.q.out


> Minor fix for 'tableName' in Hive.g
> ---
>
> Key: HIVE-3762
> URL: https://issues.apache.org/jira/browse/HIVE-3762
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3762.D7143.1.patch
>
>
> The current definition of 'tableName' is "(db=Identifier DOT)? tab=Identifier". 
> If the user specifies the value "default." for it, the Hive parser accepts 
> "default" as the table name and leaves "." for the next token, which is not valid.
> This is really trivial, but it is a small, necessary part of improving query 
> auto-completion (which I'm working on).



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Sivaramakrishnan Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511418#comment-13511418
 ] 

Sivaramakrishnan Narayanan commented on HIVE-3562:
--

Hive-only change is fine. A heap-based isTopN() implementation will be good.

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
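
The heap-based top-N filtering discussed in this thread can be sketched roughly as follows. This is an illustration of the general technique only, not the code in HIVE-3562.D5967.1.patch. The class name TopNSketch and its methods are hypothetical, and the sketch assumes an ascending "order by key limit n" where the sender may drop any row that cannot be among the n smallest keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Hive: remembers the n smallest keys seen
// so far and tells the caller whether a new key is still worth forwarding.
class TopNSketch<T> {
    private final int limit;
    private final Comparator<T> order;
    // Max-heap with respect to 'order': the head is the largest retained key.
    private final PriorityQueue<T> heap;

    TopNSketch(int limit, Comparator<T> order) {
        this.limit = limit;
        this.order = order;
        this.heap = new PriorityQueue<>(Math.max(1, limit), order.reversed());
    }

    /** Returns true if the key may still belong to the n smallest keys. */
    boolean offer(T key) {
        if (limit <= 0) {
            return false; // limit 0: nothing needs to be forwarded
        }
        if (heap.size() < limit) {
            heap.add(key);
            return true;
        }
        if (order.compare(key, heap.peek()) >= 0) {
            return false; // n keys that are not larger have already been seen
        }
        heap.poll(); // evict the current largest key
        heap.add(key);
        return true;
    }

    /** The retained keys in ascending order. */
    List<T> sorted() {
        List<T> out = new ArrayList<>(heap);
        out.sort(order);
        return out;
    }
}
```

Each offer costs O(log n), and memory stays bounded by n, which is why a cap on the limit value was suggested. A sender that forwards rows as they arrive will still emit some rows that are evicted later, so the final LIMIT operator remains necessary. The filter only reduces the shuffle size.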



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511434#comment-13511434
 ] 

Namit Jain commented on HIVE-3562:
--

Cool, can you also review ?
I will start reviewing too.

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3562:
-

Status: Open  (was: Patch Available)

comments on phabricator

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511442#comment-13511442
 ] 

Phabricator commented on HIVE-3562:
---

njain has commented on the revision "HIVE-3562 [jira] Some limit can be pushed 
down to map stage".

  As per the JIRA comments, can you add a new parameter for the limit on k?

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java:73
  Instead of checking for instanceof, can you add a method in Operator,
  something like canLimitBePushed(), with a lot of comments? The above
  operators can then override it to return true.
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:358
  Can you add a heap-based implementation as suggested in the JIRA?
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:86
  Can you add a lot of comments here: when is this set, which queries will
  it benefit, etc.?
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:460
  Add these to hive-default.xml.template.
  ql/src/test/queries/clientpositive/limit_pushdown.q:7
  Can you add another positive/negative test?

  explain select value, sum(key) from src group by value limit 10;

  The limit should be used in the 2nd MR job if you are inserting into a table.
  ql/src/test/queries/clientpositive/limit_pushdown.q:10
  For the 2 negative queries, can you also insert into a table? Then the
  optimization should help.

REVISION DETAIL
  https://reviews.facebook.net/D5967

To: JIRA, navis
Cc: njain


> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Updated] (HIVE-3672) Support altering partition column type in Hive

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3672:
-

Status: Open  (was: Patch Available)

comments on phabricator

> Support altering partition column type in Hive
> --
>
> Key: HIVE-3672
> URL: https://issues.apache.org/jira/browse/HIVE-3672
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, SQL
>Affects Versions: 0.10.0
>Reporter: Jingwei Lu
>Assignee: Jingwei Lu
> Fix For: 0.10.0
>
> Attachments: HIVE-3672.1.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Currently, Hive does not allow altering partition column types.  As we've 
> discouraged users from using non-string partition column types, this presents 
> a problem for users who want to change their partition columns to be strings: 
> they have to rename their table, create a new table, and copy all the data 
> over.
> To support this via the CLI, adding a command like ALTER TABLE  
> PARTITION COLUMN ( );



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511477#comment-13511477
 ] 

Phabricator commented on HIVE-3562:
---

tarball has requested changes to the revision "HIVE-3562 [jira] Some limit can 
be pushed down to map stage".

  Not sure if I'm following the right protocol here. Marking this as "Request 
Changes" per my comments.

INLINE COMMENTS
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:460
  Minor suggestion, feel free to ignore :) Prefer parameter name
  "hive.limit.twophase.enable".
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:129
  Why not limit 0 (unless hive optimizes those away)?
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:305
  Is this variable not used at all?
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:323
  This code seems error-prone. It has a side effect that's difficult to
  understand: it is not obvious that the caller must wipe out value before
  the next row is processed. Any particular reason we're caching it at all?
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java:346
  If o1 and o2 are null, shouldn't this return 0?
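
The null-handling question in the last inline comment above can be illustrated with a small wrapper. This is a sketch under assumptions, not the comparator from the patch under review. The class name NullSafeCompareSketch is hypothetical, and the choice to order nulls before non-null values is arbitrary here.

```java
import java.util.Comparator;

// Hypothetical helper, not part of Hive: wraps a comparator so that nulls are
// handled consistently and two nulls compare as equal.
class NullSafeCompareSketch {
    static <T> Comparator<T> nullsFirst(Comparator<T> base) {
        return (o1, o2) -> {
            if (o1 == null && o2 == null) {
                return 0; // both null: equal, as the review comment suggests
            }
            if (o1 == null) {
                return -1; // assumed ordering: null sorts before non-null
            }
            if (o2 == null) {
                return 1;
            }
            return base.compare(o1, o2);
        };
    }
}
```

Returning 0 when both arguments are null keeps the comparator consistent with the requirement that compare(a, b) and compare(b, a) have opposite signs. On Java 8 and later, java.util.Comparator.nullsFirst provides the same behavior.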

REVISION DETAIL
  https://reviews.facebook.net/D5967

BRANCH
  DPAL-1910

To: JIRA, tarball, navis
Cc: njain


> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage

2012-12-06 Thread Sivaramakrishnan Narayanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511482#comment-13511482
 ] 

Sivaramakrishnan Narayanan commented on HIVE-3562:
--

Added my comments to Phabricator. I'm new to these tools and the process. So, 
please forgive any mistakes on my part. 

Good stuff, Navis!

> Some limit can be pushed down to map stage
> --
>
> Key: HIVE-3562
> URL: https://issues.apache.org/jira/browse/HIVE-3562
> Project: Hive
>  Issue Type: Bug
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3562.D5967.1.patch
>
>
> Queries with a limit clause (with a reasonably small number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> produce the operator tree
> TS-SEL-RS-EXT-LIMIT-FS
> But LIMIT can be partially calculated in RS, reducing the size of the shuffle:
> TS-SEL-RS(TOP-N)-EXT-LIMIT-FS



Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #220

2012-12-06 Thread Apache Jenkins Server
See 


--
[...truncated 9918 lines...]

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 


test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 

 against hadoop 0.20.2 
(

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 


ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 

 against hadoop 1.0.0 
(

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 


ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.23

build_shims:
 [echo] Project: shims
 [echo] Compiling 

 against hadoop 0.23.3 
(

[jira] [Updated] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2477:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thanks Navis

> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, hive.2477.4.patch, 
> HIVE-2477.D7161.1.patch, HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



[jira] [Assigned] (HIVE-3773) Share input scan by unions across multiple queries

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu reassigned HIVE-3773:
--

Assignee: Gang Tim Liu  (was: Namit Jain)

> Share input scan by unions across multiple queries
> --
>
> Key: HIVE-3773
> URL: https://issues.apache.org/jira/browse/HIVE-3773
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Consider a query like:
> select * from
> (
>   select key, 1 as value, count(1) from src group by key
> union all
>   select 1 as key, value, count(1) from src group by value
> union all
>   select key, value, count(1) from src group by key, value
> ) s;
> src is scanned multiple times currently (one per sub-query).
> This should be treated like a multi-table insert by the optimizer.



[jira] [Updated] (HIVE-3485) HIve List Bucketing: Skewed DDL doesn't support skewed value with string quote

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3485:
---

Summary: HIve List Bucketing: Skewed DDL doesn't support skewed value with 
string quote  (was: Skewed DDL doesn't support skewed value with string quote)

> HIve List Bucketing: Skewed DDL doesn't support skewed value with string quote
> --
>
> Key: HIVE-3485
> URL: https://issues.apache.org/jira/browse/HIVE-3485
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: hive-3485.patch.1
>
>
> CREATE TABLE list_bucket_single (key STRING, value STRING) SKEWED BY (key) ON 
> ('1','5','6')
> The skewed value is saved in the map as '1' (with the quotes); it should be saved as 1.



[jira] [Updated] (HIVE-3485) Hive List Bucketing - Skewed DDL doesn't support skewed value with string quote

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3485:
---

Summary: Hive List Bucketing - Skewed DDL doesn't support skewed value with 
string quote  (was: HIve List Bucketing: Skewed DDL doesn't support skewed 
value with string quote)

> Hive List Bucketing - Skewed DDL doesn't support skewed value with string 
> quote
> ---
>
> Key: HIVE-3485
> URL: https://issues.apache.org/jira/browse/HIVE-3485
> Project: Hive
>  Issue Type: Bug
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
>Priority: Minor
> Attachments: hive-3485.patch.1
>
>
> CREATE TABLE list_bucket_single (key STRING, value STRING) SKEWED BY (key) ON 
> ('1','5','6')
> The skewed value is saved in the map as '1' (with the quotes); it should be saved as 1.



[jira] [Updated] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3767:
---

Release Note: It seems like even after 
https://issues.apache.org/jira/browse/HIVE-3219 we still need to set 
hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for 
bucket map joins (though not SMB joins)

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>




[jira] [Updated] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3767:
---

Component/s: Query Processor

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>




[jira] [Updated] (HIVE-3766) Enable adding hooks to hive meta store init

2012-12-06 Thread Jean Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean Xu updated HIVE-3766:
--

Attachment: jira3766.txt

> Enable adding hooks to hive meta store init
> ---
>
> Key: HIVE-3766
> URL: https://issues.apache.org/jira/browse/HIVE-3766
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jean Xu
>Assignee: Jean Xu
> Attachments: jira3766.txt
>
>
> We will enable hooks to be added to init HMSHandler



[jira] [Updated] (HIVE-3766) Enable adding hooks to hive meta store init

2012-12-06 Thread Jean Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean Xu updated HIVE-3766:
--

Status: Patch Available  (was: Open)

The patch jira3766 is attached

> Enable adding hooks to hive meta store init
> ---
>
> Key: HIVE-3766
> URL: https://issues.apache.org/jira/browse/HIVE-3766
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jean Xu
>Assignee: Jean Xu
> Attachments: jira3766.txt
>
>
> We will enable hooks to be added to init HMSHandler



[jira] [Commented] (HIVE-3766) Enable adding hooks to hive meta store init

2012-12-06 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13512085#comment-13512085
 ] 

Kevin Wilfong commented on HIVE-3766:
-

+1

> Enable adding hooks to hive meta store init
> ---
>
> Key: HIVE-3766
> URL: https://issues.apache.org/jira/browse/HIVE-3766
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Jean Xu
>Assignee: Jean Xu
> Attachments: jira3766.txt
>
>
> We will enable hooks to be added to init HMSHandler



[jira] [Updated] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3767:
---

 Description: It seems like even after 
https://issues.apache.org/jira/browse/HIVE-3219 we still need to set 
hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for 
bucket map joins (though not SMB joins)
Release Note:   (was: It seems like even after 
https://issues.apache.org/jira/browse/HIVE-3219 we still need to set 
hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for 
bucket map joins (though not SMB joins))

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #220

2012-12-06 Thread Apache Jenkins Server
See 

--
[...truncated 36473 lines...]
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-12-06_12-40-21_212_6872868338600461300/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201212061240_70365534.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2012-12-06_12-40-25_217_4552613240785764133/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-12-06_12-40-25_217_4552613240785764133/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201212061240_26833716.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201212061240_467509295.txt
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201212061240_1196142689.txt
[junit] Copying file: 
file:/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/data/files/kv1.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: typ

[jira] [Commented] (HIVE-2439) Upgrade antlr version to 3.4

2012-12-06 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13512117#comment-13512117
 ] 

Thiruvel Thirumoolan commented on HIVE-2439:


There are two failures in TestCliDriver related to views which are a little 
weird (exception at the end). As per [1], during semantic analysis Hive 
expands the query [for example, it replaces a * with all the columns]. This is 
done using the antlr TokenRewriteStream class. The inputs Hive passes to this 
class haven't changed because of the version bump (or the updated 
Lexer/Parser), but comparing the sources, TokenRewriteStream.toString() has 
been almost entirely rewritten since 3.0.1. The IllegalArgumentException comes 
from the additional code in 3.4.

This does not happen with all view-related tests. Going by the test case 
failures, it happens when there is a join. Other test cases related to lateral 
views and UDF/UDTF all pass, but two tests related to joins fail [a view being 
created from a join of tables]. Failed test cases: join_view.q and 
ppd_union_view.q. Not sure if there is a problem in the way we are using this 
API.

[1] - 
https://cwiki.apache.org/confluence/display/Hive/ViewDev#ViewDev-StoredViewDefinition

Exception:

...
create view v as select invites.bar, invites2.foo, invites2.ds from invites
join invites2 on invites.ds=invites2.ds
...
2012-12-04 10:59:19,647 DEBUG parse.SemanticAnalyzer
(SemanticAnalyzer.java:genPlan(6766)) - Created Plan for Query Block null
2012-12-04 10:59:19,648 ERROR ql.Driver (SessionState.java:printError(400)) -
FAILED: Hive Internal Error: java.lang.IllegalArgumentException(replace op
boundaries of
,5:101]..[@41,138:139='ds',<24>,5:101]:"`ds`">
overlap with previous
,5:93]..[@41,138:139='ds',<24>,5:101]:"`invites`.`ds`">)
java.lang.IllegalArgumentException: replace op boundaries of
,5:101]..[@41,138:139='ds',<24>,5:101]:"`ds`">
overlap with previous
,5:93]..[@41,138:139='ds',<24>,5:101]:"`invites`.`ds`">
at
org.antlr.runtime.TokenRewriteStream.reduceToSingleOperationPerIndex(TokenRewriteStream.java:504)
at
org.antlr.runtime.TokenRewriteStream.toString(TokenRewriteStream.java:374)
at
org.antlr.runtime.TokenRewriteStream.toString(TokenRewriteStream.java:358)
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.saveViewDefinition(SemanticAnalyzer.java:7602)
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7537)
at
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:244)
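
The overlap reported in the exception above can be modelled with a small sketch. This is Python written for illustration, not ANTLR's code: the ReplaceOp class and the combine function are hypothetical, and the rule they encode (a later replace may swallow an earlier one it fully covers, but may not sit inside it) is inferred from the exception message and should be checked against the ANTLR 3.4 sources. Token index 39 for `invites` is an assumption; the log only shows index 41 for `ds`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplaceOp:
    first: int  # index of the first token replaced
    last: int   # index of the last token replaced (inclusive)
    text: str   # replacement text


def combine(prev: ReplaceOp, cur: ReplaceOp) -> str:
    """Decide what happens when `cur` is registered after `prev`."""
    if cur.first <= prev.first and prev.last <= cur.last:
        return "drop previous"  # the later op covers the earlier one
    if prev.last < cur.first or cur.last < prev.first:
        return "keep both"      # the token ranges do not touch
    # Partial overlap, or the later op sits inside the earlier one.
    raise ValueError(
        f"replace op boundaries of {cur} overlap with previous {prev}"
    )


# Hypothetical ops modelled on the failing join view.
wide = ReplaceOp(39, 41, "`invites`.`ds`")  # 39 is an assumed index
narrow = ReplaceOp(41, 41, "`ds`")
```

Under these assumptions, combine(wide, narrow) raises, which matches the order shown in the log (the `ds` replace arrives after the `invites`.`ds` replace), while combine(narrow, wide) would be accepted.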

> Upgrade antlr version to 3.4
> 
>
> Key: HIVE-2439
> URL: https://issues.apache.org/jira/browse/HIVE-2439
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Ashutosh Chauhan
> Attachments: HIVE-2439_branch9_2.patch, HIVE-2439_branch9.patch, 
> hive-2439_incomplete.patch
>
>
> Upgrade antlr version to 3.4



[jira] [Commented] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13514647#comment-13514647
 ] 

Zhenxiao Luo commented on HIVE-3140:


[~namit], thanks for the comments. I addressed them and resubmitted the patch at:
https://reviews.facebook.net/D7173

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>  2 = comment2
>  3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.
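
The indentation rule described above can be sketched as follows. This is a minimal Python illustration, not the code from the attached patches; the function name format_describe_row and the single-space separator are assumptions made for the example.

```python
def format_describe_row(name: str, col_type: str, comment: str) -> str:
    """Render one describe row, aligning continuation lines of the comment."""
    prefix = f"{name} {col_type} "
    # Continuation lines start at the same column as the first comment line.
    pad = " " * len(prefix)
    first, *rest = comment.split("\n")
    return "\n".join([prefix + first] + [pad + line for line in rest])
```

For the example in the description, every continuation line is padded by the width of "birthdayParam string ", so "2 = comment2" starts directly under "1 = comment1".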



[jira] [Updated] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3140:
---

Attachment: HIVE-3140.2.patch.txt

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>  2 = comment2
>  3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.



[jira] [Updated] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3140:
---

Status: Patch Available  (was: Open)

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>  2 = comment2
>  3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.



[jira] [Commented] (HIVE-3026) List Bucketing in Hive

2012-12-06 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13525881#comment-13525881
 ] 

Gang Tim Liu commented on HIVE-3026:


Since major patches have been committed, we will close the feature jira.

We have filed a few jiras for follow-up. We will track progress there 
separately.

thanks a lot

> List Bucketing in Hive
> --
>
> Key: HIVE-3026
> URL: https://issues.apache.org/jira/browse/HIVE-3026
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Details are at:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> Please comment



[jira] [Work started] (HIVE-3026) List Bucketing in Hive

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3026 started by Gang Tim Liu.

> List Bucketing in Hive
> --
>
> Key: HIVE-3026
> URL: https://issues.apache.org/jira/browse/HIVE-3026
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Details are at:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> Please comment



[jira] [Resolved] (HIVE-3026) List Bucketing in Hive

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu resolved HIVE-3026.


Resolution: Fixed

> List Bucketing in Hive
> --
>
> Key: HIVE-3026
> URL: https://issues.apache.org/jira/browse/HIVE-3026
> Project: Hive
>  Issue Type: New Feature
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Details are at:
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
> Please comment



[jira] [Commented] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526012#comment-13526012
 ] 

Navis commented on HIVE-3401:
-

My local git index seemed broken. I'll fix it.

> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar, 
> something like TABLESAMPLE(500M). 
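
The manual calculation mentioned above (turning a desired input size into a percentage) might look like the sketch below. It is an illustration only: the helper names, the accepted suffixes (K, M, G) and the 1024-based units are assumptions, not the grammar implemented by the patch.

```python
import re

# Assumed unit suffixes; the actual patch may accept a different set.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Parse a size literal such as '500M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a size literal: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def size_to_percent(size: str, total_bytes: int) -> float:
    """Percentage of the input that a sample of `size` corresponds to."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return min(100.0, 100.0 * parse_size(size) / total_bytes)
```

With these assumptions, a 500M sample of a 1000 MB input corresponds to TABLESAMPLE(50 PERCENT).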



[jira] [Work started] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-3767 started by Gang Tim Liu.

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



Hive-trunk-h0.21 - Build # 1838 - Failure

2012-12-06 Thread Apache Jenkins Server
Changes for Build #1831
[namit] HIVE-3747 Provide hive operation name for hookContext
(Shreepadma Venugopalan via namit)


Changes for Build #1832

Changes for Build #1833
[namit] HIVE-3750 JDBCStatsPublisher fails when ID length exceeds length of ID 
column
(Kevin Wilfong via namit)


Changes for Build #1835
[namit] HIVE-3073 Hive List Bucketing - DML support
(Gang Tim Liu via namit)

[namit] HIVE-3771 HIVE-3750 broke TestParse
(Kevin Wilfong via namit)

[hashutosh] HIVE-3384 : HIVE JDBC module won't compile under JDK1.7 as new 
methods added in JDBC specification (Shengsheng Huang, Chris Drome, Mikhail 
Bautin via Ashutosh Chauhan)

[namit] HIVE-3702 Renaming table changes table location scheme/authority
(Kevin Wilfong via namit)


Changes for Build #1836
[namit] HIVE-2723 should throw "Ambiguous column reference key" Exception in 
particular
join condition (Navis via namit)

[namit] HIVE-3594 When Group by Partition Column Type is Timestamp or STRING 
Which Format contains "HH:MM:SS",
It will occur URISyntaxException (Navis via namit)


Changes for Build #1837
[namit] HIVE-3762 Minor fix for 'tableName' in Hive.g
(Navis via namit)


Changes for Build #1838



2 tests failed.
REGRESSION:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... 
-Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
more logs.
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:12372)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)


FAILED:  org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... 
-Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
more logs.
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19(TestCliDriver.java:41208)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1838)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1838/ to 
view the results.

[jira] [Updated] (HIVE-3733) Improve Hive's logic for conditional merge

2012-12-06 Thread Pradeep Kamath (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep Kamath updated HIVE-3733:
-

Attachment: HIVE-3733.4.patch.txt

Changed the code which looks in the Operator Stack to look for 
ReduceSinkOperator instead of the exact CurrWork.getReducer() instance.

union19 no longer performs a conditional merge with this change. My hypothesis 
for this follows:

The union19 query is:
FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
UNION ALL
select s2.key as key, s2.value as value from src s2) unionsrc
INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, count(unionsrc.value) group 
by unionsrc.key
INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, 
unionsrc.value;

The from subquery has an implicit group by/ReduceSink due to the count. So 
though the second insert in the multi insert by itself does not have a 
groupby/ReduceSink, the subquery in the from clause causes the 
groupby/ReduceSink to appear in the stack and hence we decide not to do the 
conditional merge since the FileSink will be in the reduce.
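
The decision described in this hypothesis can be sketched in a few lines. This is a simplified Python model, not the code in the patch: the class names mirror Hive's operator names, but the function should_add_conditional_merge and the way the operator stack is represented are assumptions made for illustration.

```python
class Operator:
    """Stand-in for a Hive operator; only the type matters here."""


class TableScanOperator(Operator):
    pass


class GroupByOperator(Operator):
    pass


class ReduceSinkOperator(Operator):
    pass


class UnionOperator(Operator):
    pass


class FileSinkOperator(Operator):
    pass


def should_add_conditional_merge(stack, merge_mapfiles, merge_mapredfiles):
    """Decide whether a FileSink at the top of `stack` gets a merge step.

    `stack` is the operator path from the root down to the FileSink. If any
    ReduceSinkOperator is on that path, the FileSink runs in the reducer.
    """
    in_reducer = any(isinstance(op, ReduceSinkOperator) for op in stack)
    return merge_mapredfiles if in_reducer else merge_mapfiles


# Second insert of union19: the count() in the subquery puts a ReduceSink
# on the path even though the insert itself has no group by.
union19_stack = [TableScanOperator(), GroupByOperator(), ReduceSinkOperator(),
                 GroupByOperator(), UnionOperator(), FileSinkOperator()]
```

With hive.merge.mapfiles=true and hive.merge.mapredfiles=false, this model returns False for union19_stack, which is consistent with union19 no longer getting a conditional merge.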

> Improve Hive's logic for conditional merge
> --
>
> Key: HIVE-3733
> URL: https://issues.apache.org/jira/browse/HIVE-3733
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pradeep Kamath
>Assignee: Pradeep Kamath
> Attachments: HIVE-3733.1.patch.txt, HIVE-3733.3.patch.txt, 
> HIVE-3733.4.patch.txt
>
>
> If the config hive.merge.mapfiles is set to true and hive.merge.mapredfiles 
> is set to false, then when Hive encounters a FileSinkOperator while generating 
> map reduce tasks, it will look at the entire job to see if it has a reducer; 
> if it does, it will not merge. Instead it should check whether the 
> FileSinkOperator is a child of the reducer. This means that outputs generated 
> in the mapper will be merged, and outputs generated in the reducer will not 
> be, which is the intended effect of setting those configs.
> Simple repro:
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=false;
> EXPLAIN
> FROM 
> INSERT OVERWRITE TABLE  SELECT key, COUNT(*) group by key
> INSERT OVERWRITE TABLE  SELECT *;
> The output should contain a Conditional Operator, Mapred Stages, and Move 
> tasks



[jira] [Commented] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526133#comment-13526133
 ] 

Namit Jain commented on HIVE-3140:
--

+1

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>  2 = comment2
>  3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.



[jira] [Updated] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3140:
-

Attachment: hive.3140.3.patch

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt, 
> hive.3140.3.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>  2 = comment2
>  3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.



[jira] [Commented] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526135#comment-13526135
 ] 

Gang Tim Liu commented on HIVE-3767:


https://reviews.facebook.net/D7209

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



[jira] [Created] (HIVE-3777) add hive.stats.accurate in the partition

2012-12-06 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3777:


 Summary: add hive.stats.accurate in the partition
 Key: HIVE-3777
 URL: https://issues.apache.org/jira/browse/HIVE-3777
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain


Currently, the stats task tries to update the statistics in the table/partition
being updated after the table/partition is loaded. In case of a failure to 
update these stats (for any reason), the operation either succeeds
(writing inaccurate stats) or fails depending on whether hive.stats.reliable
is set to true. This can be bad for applications that do not always care about
reliable stats, since the query may have taken a long time to execute and then
fail at the very end.

Another option should be added: hive.accurate.stats. If hive.stats.reliable is
set to false, and stats could not be computed correctly, the operation would
still succeed, update the stats, but set hive.accurate.stats to false.
If the application cares about accurate stats, they can be obtained in the 
background.
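
The proposed behaviour can be summarised with a small sketch. This is a Python illustration of the decision only, under the assumption that the outcome depends on just two inputs; the function finish_stats_task and the shape of its return value are hypothetical and do not correspond to any Hive API.

```python
def finish_stats_task(stats_ok: bool, stats_reliable: bool) -> dict:
    """Model the outcome of the stats task after a load.

    stats_ok       -- whether the statistics could be computed correctly
    stats_reliable -- value of hive.stats.reliable
    """
    if stats_ok:
        return {"succeeded": True, "accurate": True}
    if stats_reliable:
        # Existing behaviour: a stats failure fails the whole operation.
        raise RuntimeError("stats update failed and hive.stats.reliable=true")
    # Proposed behaviour: succeed, but record that the stats are not accurate.
    return {"succeeded": True, "accurate": False}
```

An application that needs accurate stats could then look for partitions whose flag is false and recompute their stats in the background.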



[jira] [Updated] (HIVE-3286) Explicit skew join on user provided condition

2012-12-06 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3286:
--

Attachment: HIVE-3286.D4287.5.patch

navis updated the revision "HIVE-3286 [jira] Explicit skew join on user 
provided condition".
Reviewers: JIRA

  Addressed comments


REVISION DETAIL
  https://reviews.facebook.net/D4287

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveKey.java
  ql/src/java/org/apache/hadoop/hive/ql/io/SkewedKeyPartitioner.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBJoinTree.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/SkewContext.java
  ql/src/test/queries/clientpositive/skewjoin_explict.q
  ql/src/test/results/clientpositive/skewjoin_explict.q.out

To: JIRA, navis
Cc: njain


> Explicit skew join on user provided condition
> -
>
> Key: HIVE-3286
> URL: https://issues.apache.org/jira/browse/HIVE-3286
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3286.D4287.5.patch
>
>
> A join on a table with skewed data spends most of its execution time 
> handling the skewed keys. But usually we already know about the skew and even 
> know what the skewed keys look like.
> If we can explicitly assign reducer slots for the skewed keys, total 
> execution time could be greatly shortened.
> As a start, I've extended the join grammar like this.
> {code}
> select * from src a join src b on a.key=b.key skew on (a.key+1 < 50, a.key+1 
> < 100, a.key < 150);
> {code}
> which means that if the above query is executed by 20 reducers, one reducer 
> is used for a.key+1 < 50, one for 50 <= a.key+1 < 100, one for 99 <= 
> a.key < 150, and 17 for the other rows (this could be extended later to 
> assign more than one reducer per condition).
> This can only be used with common inner equi-joins, and the skew conditions 
> should be composed of join keys only.
> The work done so far will be posted shortly, after code cleanup.
> 
> Skew expressions* in "SKEW ON (expr, expr, ...)" are evaluated sequentially 
> at runtime, and the first one that is true decides the skew group for the 
> row. Each skew group has reserved partition slot(s), to which all rows in 
> the group are assigned. 
> The number of partition slots reserved for each group is also decided at 
> runtime, by a simple percentage calculation. If a skew group is "CLUSTER BY 
> 20 PERCENT" and the total number of partition slots (= number of reducers) 
> is 20, that group will reserve 4 partition slots, etc.
> "DISTRIBUTE BY" decides how the rows in a group are dispersed across the 
> reserved slots (if there is only one slot for a group, this is meaningless). 
> Currently, three distribution policies are available: RANDOM, KEYS, and 
> expression. 
> 1. RANDOM : rows of the driver** alias are dispersed randomly and rows of 
> non-driver aliases are duplicated to all the slots (default if not specified)
> 2. KEYS : determined by the hash value of the keys (same as before)
> 3. expression : determined by the hash of the object evaluated by a 
> user-provided expression
> Only possible with inner, equi, common joins. Join tree merging is not yet 
> supported.
> Might be used by other RS users like "SORT BY" or "GROUP BY".
> If column statistics exist for the key, it could be possible to apply this 
> automatically.
> For example, if 20 reducers are used for the query below,
> {code}
> select count(*) from src a join src b on a.key=b.key skew on (
>a.key = '0' CLUSTER BY 10 PERCENT,
>b.key < '100' CLUSTER BY 20 PERCENT DISTRIBUTE BY upper(b.key),
>cast(a.key as int) > 300 CLUSTER BY 40 PERCENT DISTRIBUTE BY KEYS);
> {code}
> group-0 will reserve slots 6~7, group-1 8~11, group-2 12~19 and others will 
> reserve slots 0~5.
> For a row with key='0' from alias a, the row is randomly assigned in the 
> range of 6~7 (driver alias) : 6 or 7
> For a row with key='0' from alias b, the row is distributed to all slots in 
> 6~7 (non-driver alias) : 6 and 7
> For a row with key='50', the row is assigned in the range of 8~11 by hashcode 
> of upper(b.key) : 8 + (hash(upper(key)) % 4)
> For a row with key='500', the row is assigned in the range of 12~19 by 
> hashcode of join key : 12 + (hash(key) % 8)
> For a row with key='200', the row does not belong to any skew group : hash(key) % 6
> *expressions in skew condition : 
> 
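
The slot arithmetic in the 20-reducer example above can be sketched as follows. This is a Python illustration, not the code in the attached patch: the function names are hypothetical, the rounding of percentages (integer floor) is an assumption, and the layout (non-skewed rows first, then each group in order) simply follows the example.

```python
import random


def reserve_slots(num_reducers, percents):
    """Return (slots for non-skewed rows, list of slot ranges per skew group)."""
    counts = [num_reducers * p // 100 for p in percents]  # assumed floor rounding
    others = num_reducers - sum(counts)
    groups, start = [], others
    for count in counts:
        groups.append(range(start, start + count))
        start += count
    return range(0, others), groups


def target_slots(key_hash, group, others, groups,
                 policy="RANDOM", driver=True, rng=random):
    """Slots that receive a row; `group` is None when no skew expression matched."""
    if group is None:
        return [key_hash % len(others)]
    slots = groups[group]
    if policy == "RANDOM":
        # Driver rows go to one random slot, non-driver rows to all of them.
        return [rng.choice(slots)] if driver else list(slots)
    # KEYS or expression: hash of the key (or of the expression value).
    return [slots.start + key_hash % len(slots)]
```

For 20 reducers and groups of 10, 20 and 40 percent, reserve_slots yields slots 0~5 for the other rows and 6~7, 8~11 and 12~19 for the three groups, the same layout as in the description.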

[jira] [Updated] (HIVE-3286) Explicit skew join on user provided condition

2012-12-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3286:


Affects Version/s: (was: 0.10.0)
   Status: Patch Available  (was: Open)

> Explicit skew join on user provided condition
> -
>
> Key: HIVE-3286
> URL: https://issues.apache.org/jira/browse/HIVE-3286
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3286.D4287.5.patch
>
>
> A join on a table with skewed data spends most of its execution time 
> handling the skewed keys. But usually we already know about the skew and even 
> know what the skewed keys look like.
> If we can explicitly assign reducer slots for the skewed keys, total 
> execution time could be greatly shortened.
> As a start, I've extended the join grammar like this.
> {code}
> select * from src a join src b on a.key=b.key skew on (a.key+1 < 50, a.key+1 
> < 100, a.key < 150);
> {code}
> which means that if the above query is executed by 20 reducers, one reducer 
> is used for a.key+1 < 50, one for 50 <= a.key+1 < 100, one for 99 <= 
> a.key < 150, and 17 for the other rows (this could be extended later to 
> assign more than one reducer per condition).
> This can only be used with common inner equi-joins, and the skew conditions 
> should be composed of join keys only.
> The work done so far will be posted shortly, after code cleanup.
> 
> Skew expressions* in "SKEW ON (expr, expr, ...)" are evaluated sequentially 
> at runtime, and the first one that is true decides the skew group for the 
> row. Each skew group has reserved partition slot(s), to which all rows in 
> the group are assigned. 
> The number of partition slots reserved for each group is also decided at 
> runtime, by a simple percentage calculation. If a skew group is "CLUSTER BY 
> 20 PERCENT" and the total number of partition slots (= number of reducers) 
> is 20, that group will reserve 4 partition slots, etc.
> "DISTRIBUTE BY" decides how the rows in a group are dispersed across the 
> reserved slots (if there is only one slot for a group, this is meaningless). 
> Currently, three distribution policies are available: RANDOM, KEYS, and 
> expression. 
> 1. RANDOM : rows of the driver** alias are dispersed randomly and rows of 
> non-driver aliases are duplicated to all the slots (default if not specified)
> 2. KEYS : determined by the hash value of the keys (same as before)
> 3. expression : determined by the hash of the object evaluated by a 
> user-provided expression
> Only possible with inner, equi, common joins. Join tree merging is not yet 
> supported.
> Might be used by other RS users like "SORT BY" or "GROUP BY".
> If column statistics exist for the key, it could be possible to apply this 
> automatically.
> For example, if 20 reducers are used for the query below,
> {code}
> select count(*) from src a join src b on a.key=b.key skew on (
>a.key = '0' CLUSTER BY 10 PERCENT,
>b.key < '100' CLUSTER BY 20 PERCENT DISTRIBUTE BY upper(b.key),
>cast(a.key as int) > 300 CLUSTER BY 40 PERCENT DISTRIBUTE BY KEYS);
> {code}
> group-0 will reserve slots 6~7, group-1 8~11, group-2 12~19, and all other
> rows will use slots 0~5.
> For a row with key='0' from alias a, the row is randomly assigned within the
> range 6~7 (driver alias) : 6 or 7
> For a row with key='0' from alias b, the row is distributed to all slots in
> 6~7 (non-driver alias) : 6 and 7
> For a row with key='50', the row is assigned within the range 8~11 by the
> hashcode of upper(b.key) : 8 + (hash(upper(key)) % 4)
> For a row with key='500', the row is assigned within the range 12~19 by the
> hashcode of the join key : 12 + (hash(key) % 8)
> For a row with key='200', which does not belong to any skew group : hash(key)
> % 6
> *expressions in skew condition :
> 1. all expressions should be built from expressions in the join condition,
> which means that if the join condition is "a.key=b.key", the user can make
> any expression with "a.key" or "b.key". But if the join condition is
> a.key+1=b.key, the user cannot make an expression with "a.key" alone (it
> should be an expression with "a.key+1").
> 2. all expressions should reference one and only one side of the aliases. For
> example, simple constant expressions or expressions referencing both sides of
> the join condition ("a.key+b.key<100") are not allowed.
> 3. all functions in an expression should be deterministic and stateless.
> 4. if "DISTRIBUTE BY expression" is used, the distribution expression should
> also reference the same alias as the skew expression.
> **driver alias :
> 1. driver alias means the sole referenced alias from skew expression, which 
> is important for RANDOM distribution. rows of driver alias are assigned to 
> single slot randomly, but rows of non-driver alias are duplicated for all the 
> slots. So, driver alia
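
The slot arithmetic in the quoted skew-join description can be sketched in Python as follows. This is only an illustration, not the patch: the function names (reserve_slots, slot_for_hash, route_random) are hypothetical, rounding of percentages is assumed to be integer floor division, and the caller supplies the hash value because the description does not say which hash function is used.

```python
import random


def reserve_slots(total_slots, percents):
    """Split reducer slots into a shared range plus one range per skew group.

    Layout follows the example in the description: non-skewed rows get the
    lowest slots, then each skew group gets its share in declaration order.
    Floor division for the percentage is an assumption.
    """
    sizes = [total_slots * p // 100 for p in percents]
    others = total_slots - sum(sizes)
    if others <= 0 or any(size == 0 for size in sizes):
        raise ValueError("every group and the non-skewed rows need a slot")
    groups, start = [], others
    for size in sizes:
        groups.append(range(start, start + size))
        start += size
    return range(0, others), groups


def slot_for_hash(slots, hash_value):
    """KEYS / expression policy: offset into the group's range by hash."""
    return slots.start + hash_value % len(slots)


def route_random(slots, driver):
    """RANDOM policy: a driver row goes to one random slot, others to all."""
    return [random.choice(slots)] if driver else list(slots)
```

With 20 slots and groups of 10, 20 and 40 percent, reserve_slots(20, [10, 20, 40]) gives 0~5 for the other rows and 6~7, 8~11 and 12~19 for the three groups, which matches the example in the description.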

[jira] [Updated] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3777:
-

Description: 
Currently, the stats task tries to update the statistics in the table/partition
being updated after the table/partition is loaded. In case of a failure to 
update these stats (for any reason), the operation either succeeds
(writing inaccurate stats) or fails depending on whether hive.stats.reliable
is set to true. This can be bad for applications that do not always care about
reliable stats, since the query may have taken a long time to execute and then
fail eventually.

Another property should be added to the partition: areStatsAccurate. If 
hive.stats.reliable is set to false and stats could not be computed correctly,
the operation would still succeed and update the stats, but set
areStatsAccurate to false.
If the application cares about accurate stats, they can be obtained in the 
background.

  was:
Currently, stats task tries to update the statistics in the table/partition
being updated after the table/partition is loaded. In case of a failure to 
update these stats (due to the any reason), the operation either succeeds
(writing inaccurate stats) or fails depending on whether hive.stats.reliable
is set to true. This can be bad for applications who do not always care about
reliable stats, since the query may have taken a long time to execute and then
fail eventually.

Another option should be added: hive.accurate.stats. If hive.stats.reliable is
set to false, and stats could not be computed correctly, the operation would
still succeed, update the stats, but set hive.accurate.stats to false.
If the application cares about accurate stats, it can be obtained in the 
background.

Summary: add a property in the partition to figure out if stats are 
accurate  (was: add hive.stats.accurate in the partition)

> add a property in the partition to figure out if stats are accurate
> ---
>
> Key: HIVE-3777
> URL: https://issues.apache.org/jira/browse/HIVE-3777
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>
> Currently, the stats task tries to update the statistics in the table/partition
> being updated after the table/partition is loaded. In case of a failure to 
> update these stats (for any reason), the operation either succeeds
> (writing inaccurate stats) or fails depending on whether hive.stats.reliable
> is set to true. This can be bad for applications that do not always care about
> reliable stats, since the query may have taken a long time to execute and then
> fail eventually.
> Another property should be added to the partition: areStatsAccurate. If 
> hive.stats.reliable is set to false and stats could not be computed correctly,
> the operation would still succeed and update the stats, but set
> areStatsAccurate to false.
> If the application cares about accurate stats, they can be obtained in the 
> background.



[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3401:
--

Attachment: HIVE-3401.D4821.4.patch

navis updated the revision "HIVE-3401 [jira] Diversify grammar for split 
sampling".
Reviewers: JIRA

  Fixing git index


REVISION DETAIL
  https://reviews.facebook.net/D4821

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java
  ql/src/test/queries/clientpositive/split_sample.q
  ql/src/test/results/clientpositive/input4.q.out
  ql/src/test/results/clientpositive/nonmr_fetch.q.out
  ql/src/test/results/clientpositive/plan_json.q.out
  ql/src/test/results/clientpositive/split_sample.q.out

To: JIRA, navis


> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar 
> forms, such as TABLESAMPLE(500M).



[jira] [Commented] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2012-12-06 Thread Sambavi Muthukrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526143#comment-13526143
 ] 

Sambavi Muthukrishnan commented on HIVE-3777:
-

I agree with the idea of adding this Partition level property so we can ensure 
stats are kept up-to-date. Seems like we should set it to be false when we 
update the metadata for the partition at time of overwrite/insert into 
(transactionally). Since we just overwrite the files, I suppose we can't 
guarantee that the property reflects the correct value unless we conservatively 
set it to false beforehand, and then update it to true whenever the stats are 
computed next. Comments?
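
The protocol discussed here (mark the stats as inaccurate before the overwrite, then flip the flag only once stats are known to be good) can be sketched as below. This is a hedged illustration rather than Hive code: partition parameters are modelled as a plain dict, load_partition and its callables are hypothetical, compute_stats is assumed to return a (stats, complete) pair, and "numRows" in the usage note is just an example key. Only areStatsAccurate and hive.stats.reliable come from the issue text.

```python
def load_partition(params, write_data, compute_stats, stats_reliable):
    """Load data into a partition and keep an accuracy flag for its stats.

    params         -- dict standing in for the partition's parameters
    write_data     -- callable that overwrites/inserts the files
    compute_stats  -- callable returning (stats_dict, complete_flag)
    stats_reliable -- stands in for hive.stats.reliable
    """
    # Conservative step: the old stats no longer describe the new files.
    params["areStatsAccurate"] = "false"
    write_data()

    stats, complete = compute_stats()
    if not complete and stats_reliable:
        # With reliable stats required, the whole operation fails.
        raise RuntimeError("stats could not be computed reliably")

    # Otherwise the operation succeeds and records whatever was collected,
    # flagging whether the numbers can be trusted.
    params.update(stats)
    params["areStatsAccurate"] = "true" if complete else "false"
    return params
```

For instance, load_partition({}, lambda: None, lambda: ({"numRows": "7"}, False), stats_reliable=False) succeeds and leaves areStatsAccurate set to "false", so a background job could later recompute the stats and set it to "true".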

> add a property in the partition to figure out if stats are accurate
> ---
>
> Key: HIVE-3777
> URL: https://issues.apache.org/jira/browse/HIVE-3777
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>
> Currently, the stats task tries to update the statistics in the table/partition
> being updated after the table/partition is loaded. In case of a failure to 
> update these stats (for any reason), the operation either succeeds
> (writing inaccurate stats) or fails depending on whether hive.stats.reliable
> is set to true. This can be bad for applications that do not always care about
> reliable stats, since the query may have taken a long time to execute and then
> fail eventually.
> Another property should be added to the partition: areStatsAccurate. If 
> hive.stats.reliable is set to false and stats could not be computed correctly,
> the operation would still succeed and update the stats, but set
> areStatsAccurate to false.
> If the application cares about accurate stats, they can be obtained in the 
> background.



[jira] [Updated] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3767:
---

Attachment: HIVE-3767.patch.2

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3767.patch.2
>
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



[jira] [Updated] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3767:
---

Status: Patch Available  (was: In Progress)

patch is available.

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3767.patch.2
>
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3401:
--

Attachment: HIVE-3401.D4821.5.patch

navis updated the revision "HIVE-3401 [jira] Diversify grammar for split 
sampling".
Reviewers: JIRA

  Fixing..


REVISION DETAIL
  https://reviews.facebook.net/D4821

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java
  ql/src/test/queries/clientpositive/split_sample.q
  ql/src/test/results/clientpositive/input4.q.out
  ql/src/test/results/clientpositive/nonmr_fetch.q.out
  ql/src/test/results/clientpositive/plan_json.q.out
  ql/src/test/results/clientpositive/split_sample.q.out

To: JIRA, navis


> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar 
> forms, such as TABLESAMPLE(500M).



[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3401:


Status: Patch Available  (was: Open)

> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar 
> forms, such as TABLESAMPLE(500M).



[jira] [Updated] (HIVE-3400) Add Retries to Hive MetaStore Connections

2012-12-06 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3400:
---

   Resolution: Fixed
Fix Version/s: 0.10.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and 0.10. Thanks, Bhushan!

> Add Retries to Hive MetaStore Connections
> -
>
> Key: HIVE-3400
> URL: https://issues.apache.org/jira/browse/HIVE-3400
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Bhushan Mandhani
>Assignee: Bhushan Mandhani
>Priority: Minor
>  Labels: metastore
> Fix For: 0.10.0
>
> Attachments: HIVE-3400.1.patch.txt, HIVE-3400.2.patch.txt, 
> HIVE-3400.3.patch.txt
>
>
> Currently, when using Thrift to access the MetaStore, if the Thrift host 
> dies, there is no mechanism to reconnect to some other host even if the 
> MetaStore URIs variable in the Conf contains multiple hosts. Hive should 
> retry and reconnect rather than throwing a communication link error.
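
The retry-and-reconnect behaviour requested in this issue can be sketched as a simple failover loop. This is a minimal sketch under stated assumptions, not the committed patch: connect_with_failover, its parameters and the connect callable are hypothetical, and the real client would open a Thrift transport where this sketch calls connect(uri).

```python
import time


def connect_with_failover(uris, connect, retries=3, delay_seconds=0.0):
    """Try each metastore URI in turn, repeating the whole list on failure.

    uris          -- list of metastore URIs taken from the configuration
    connect       -- callable that opens a connection or raises ConnectionError
    retries       -- how many passes over the URI list to make (assumed knob)
    delay_seconds -- pause between passes (assumed knob)
    """
    last_error = None
    for attempt in range(retries):
        for uri in uris:
            try:
                return connect(uri)
            except ConnectionError as err:
                # Remember the failure and move on to the next host.
                last_error = err
        if delay_seconds and attempt < retries - 1:
            time.sleep(delay_seconds)
    raise ConnectionError(f"could not reach any metastore in {uris}") from last_error
```

The point of the design is that a dead host costs one failed attempt instead of surfacing a communication link error to the caller, as long as any other configured host answers.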



[jira] [Commented] (HIVE-3733) Improve Hive's logic for conditional merge

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526155#comment-13526155
 ] 

Namit Jain commented on HIVE-3733:
--

[~pkamath], can you load the complete patch ?

Ideally, the sub-query should not matter. By the time you are looking at the 
FileSinkDesc for insert, the union should have been processed.
You should not look at the complete stack, but the first operator which would 
break the tree, which is union in this case.

Thinking more about it, this approach is fairly difficult to get right. The 
more I think about it, the more I like the earlier idea, of moving
the merge to a physical optimizer. The tasks have already been broken up, and 
the stack would be well defined. We don't have to hack this up.
[~kevinwilfong], what do you think ? 

> Improve Hive's logic for conditional merge
> --
>
> Key: HIVE-3733
> URL: https://issues.apache.org/jira/browse/HIVE-3733
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pradeep Kamath
>Assignee: Pradeep Kamath
> Attachments: HIVE-3733.1.patch.txt, HIVE-3733.3.patch.txt, 
> HIVE-3733.4.patch.txt
>
>
> If the config hive.merge.mapfiles is set to true and hive.merge.mapredfiles 
> is set to false, then when Hive encounters a FileSinkOperator while generating 
> map reduce tasks, it looks at the entire job to see if it has a reducer; 
> if it does, it will not merge. Instead it should check whether the 
> FileSinkOperator is a child of the reducer. This means that outputs generated 
> in the mapper will be merged, and outputs generated in the reducer will not 
> be, which is the intended effect of setting those configs.
> Simple repro:
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=false;
> EXPLAIN
> FROM 
> INSERT OVERWRITE TABLE  SELECT key, COUNT(*) group by key
> INSERT OVERWRITE TABLE  SELECT *;
> The output should contain a Conditional Operator, Mapred Stages, and Move 
> tasks
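
The difference between the behaviour described as current and the behaviour the issue asks for can be sketched as two small decision functions. This is an illustration only: both function names are hypothetical, and generalising the description to "use hive.merge.mapredfiles when a reducer is involved, otherwise hive.merge.mapfiles" is an assumption.

```python
def merges_today(job_has_reducer, merge_mapfiles, merge_mapredfiles):
    """Behaviour described as current: one decision for the whole job."""
    return merge_mapredfiles if job_has_reducer else merge_mapfiles


def merges_proposed(sink_under_reducer, merge_mapfiles, merge_mapredfiles):
    """Requested behaviour: decide per FileSinkOperator, by where it runs."""
    return merge_mapredfiles if sink_under_reducer else merge_mapfiles
```

In the repro, the job has a reducer because of the GROUP BY insert, so with hive.merge.mapfiles=true and hive.merge.mapredfiles=false merges_today(True, True, False) is False for both outputs, whereas merges_proposed(False, True, False) is True for the map-side SELECT * output.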



[jira] [Commented] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526157#comment-13526157
 ] 

Namit Jain commented on HIVE-3777:
--

Agreed

> add a property in the partition to figure out if stats are accurate
> ---
>
> Key: HIVE-3777
> URL: https://issues.apache.org/jira/browse/HIVE-3777
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>
> Currently, the stats task tries to update the statistics in the table/partition
> being updated after the table/partition is loaded. In case of a failure to 
> update these stats (for any reason), the operation either succeeds
> (writing inaccurate stats) or fails depending on whether hive.stats.reliable
> is set to true. This can be bad for applications that do not always care about
> reliable stats, since the query may have taken a long time to execute and then
> fail eventually.
> Another property should be added to the partition: areStatsAccurate. If 
> hive.stats.reliable is set to false and stats could not be computed correctly,
> the operation would still succeed and update the stats, but set
> areStatsAccurate to false.
> If the application cares about accurate stats, they can be obtained in the 
> background.



[jira] [Commented] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526156#comment-13526156
 ] 

Namit Jain commented on HIVE-3767:
--

+1

looks good, can you file a follow-up for adding MapJoinDesc.isBucketMapJoin() 
as part of explain plan ?

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3767.patch.2
>
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



[jira] [Commented] (HIVE-3733) Improve Hive's logic for conditional merge

2012-12-06 Thread Pradeep Kamath (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526161#comment-13526161
 ] 

Pradeep Kamath commented on HIVE-3733:
--

Hi Namit - HIVE-3733.4.patch.txt has the most recent full patch.

> Improve Hive's logic for conditional merge
> --
>
> Key: HIVE-3733
> URL: https://issues.apache.org/jira/browse/HIVE-3733
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pradeep Kamath
>Assignee: Pradeep Kamath
> Attachments: HIVE-3733.1.patch.txt, HIVE-3733.3.patch.txt, 
> HIVE-3733.4.patch.txt
>
>
> If the config hive.merge.mapfiles is set to true and hive.merge.mapredfiles 
> is set to false, then when Hive encounters a FileSinkOperator while generating 
> map reduce tasks, it looks at the entire job to see if it has a reducer; 
> if it does, it will not merge. Instead it should check whether the 
> FileSinkOperator is a child of the reducer. This means that outputs generated 
> in the mapper will be merged, and outputs generated in the reducer will not 
> be, which is the intended effect of setting those configs.
> Simple repro:
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=false;
> EXPLAIN
> FROM 
> INSERT OVERWRITE TABLE  SELECT key, COUNT(*) group by key
> INSERT OVERWRITE TABLE  SELECT *;
> The output should contain a Conditional Operator, Mapred Stages, and Move 
> tasks



[jira] [Created] (HIVE-3778) Add MapJoinDesc.isBucketMapJoin() as part of explain plan

2012-12-06 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-3778:
--

 Summary: Add MapJoinDesc.isBucketMapJoin() as part of explain plan
 Key: HIVE-3778
 URL: https://issues.apache.org/jira/browse/HIVE-3778
 Project: Hive
  Issue Type: Bug
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu
Priority: Minor


This is a follow-up to HIVE-3767:

Add MapJoinDesc.isBucketMapJoin() as part of explain plan



[jira] [Commented] (HIVE-3767) BucketizedHiveInputFormat should be automatically used with Bucketized Map Joins also

2012-12-06 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526163#comment-13526163
 ] 

Gang Tim Liu commented on HIVE-3767:


thanks a lot

Yes, filed HIVE-3778.

thanks a lot

> BucketizedHiveInputFormat should be automatically used with Bucketized Map 
> Joins also
> -
>
> Key: HIVE-3767
> URL: https://issues.apache.org/jira/browse/HIVE-3767
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
> Attachments: HIVE-3767.patch.2
>
>
> It seems like even after https://issues.apache.org/jira/browse/HIVE-3219 we 
> still need to set hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat for bucket map joins 
> (though not SMB joins)



[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3401:
-

Status: Open  (was: Patch Available)

comments on phabricator

> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar 
> forms, such as TABLESAMPLE(500M).



[jira] [Commented] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526166#comment-13526166
 ] 

Phabricator commented on HIVE-3401:
---

njain has commented on the revision "HIVE-3401 [jira] Diversify grammar for 
split sampling".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g:1813 This is 
super-confusing.

  Can you use different tokens instead of TRUE and FALSE to differentiate between 
%, rows etc.?
  ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java:40 This comment 
is no longer valid.

  Since only one of them is valid, do you want to create a sub-class (to 
simulate unions)?
  It is a pretty minor thing, so even leaving it as is is fine; just add more comments.
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java:483 Hide 
splitSample.getLength(), getPercent() in a public method in SplitSample -
  CHIF need not know these details.
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java:443 same as for 
CHIF
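
The encapsulation asked for in the inline comments (callers such as CombineHiveInputFormat and FetchOperator should not need to know whether the sample was given as a percentage or as a size) could look roughly like the sketch below. It is written in Python for brevity and is not the Hive class: the field names, the target_size method and the choice to cap a byte-length sample at the input size are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SplitSample:
    """Holds exactly one sampling mode, percent or total length in bytes."""

    percent: Optional[float] = None
    total_length: Optional[int] = None

    def __post_init__(self):
        # Exactly one of the two fields may be set, like a tagged union.
        if (self.percent is None) == (self.total_length is None):
            raise ValueError("set exactly one of percent or total_length")

    def target_size(self, total_input_bytes: int) -> int:
        """Number of bytes to read; callers never inspect the mode."""
        if self.percent is not None:
            return int(total_input_bytes * self.percent / 100)
        return min(self.total_length, total_input_bytes)
```

A caller would then only ask sample.target_size(total_input_bytes), whichever TABLESAMPLE form the query used.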

REVISION DETAIL
  https://reviews.facebook.net/D4821

To: JIRA, navis
Cc: njain


> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar 
> forms, such as TABLESAMPLE(500M).



[jira] [Commented] (HIVE-3733) Improve Hive's logic for conditional merge

2012-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526168#comment-13526168
 ] 

Namit Jain commented on HIVE-3733:
--

Thanks. What do you think of moving this logic to the point where the tasks 
have been finalized (PhysicalOptimizer.java)?

> Improve Hive's logic for conditional merge
> --
>
> Key: HIVE-3733
> URL: https://issues.apache.org/jira/browse/HIVE-3733
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pradeep Kamath
>Assignee: Pradeep Kamath
> Attachments: HIVE-3733.1.patch.txt, HIVE-3733.3.patch.txt, 
> HIVE-3733.4.patch.txt
>
>
> If the config hive.merge.mapfiles is set to true and hive.merge.mapredfiles 
> is set to false, then when Hive encounters a FileSinkOperator while generating 
> map reduce tasks, it looks at the entire job to see if it has a reducer; 
> if it does, it will not merge. Instead it should check whether the 
> FileSinkOperator is a child of the reducer. This means that outputs generated 
> in the mapper will be merged, and outputs generated in the reducer will not 
> be, which is the intended effect of setting those configs.
> Simple repro:
> set hive.merge.mapfiles=true;
> set hive.merge.mapredfiles=false;
> EXPLAIN
> FROM 
> INSERT OVERWRITE TABLE  SELECT key, COUNT(*) group by key
> INSERT OVERWRITE TABLE  SELECT *;
> The output should contain a Conditional Operator, Mapred Stages, and Move 
> tasks



[jira] [Commented] (HIVE-2991) Integrate Clover with Hive

2012-12-06 Thread Ilya Katsov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526169#comment-13526169
 ] 

Ilya Katsov commented on HIVE-2991:
---

Apparently it makes sense to exclude auto-generated and sample code from the 
coverage reports to obtain a realistic total coverage percentage. Updated 
patches have been attached. 

> Integrate Clover with Hive
> --
>
> Key: HIVE-2991
> URL: https://issues.apache.org/jira/browse/HIVE-2991
> Project: Hive
>  Issue Type: Test
>  Components: Testing Infrastructure
>Affects Versions: 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
> hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
> hive.2991.1.trunk.patch
>
>
> Atlassian has donated a license for their code coverage tool Clover to ASF. 
> Let's make use of it to generate code coverage reports to figure out which 
> areas of Hive are well tested and which ones are not. More information about 
> the license can be found in Hadoop jira HADOOP-1718 



[jira] [Updated] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3140:
-

Status: Open  (was: Patch Available)

I am getting ~67 test failures. Most of them seem to need test output updates.
Can you update?

> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt, 
> hive.3140.3.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>                      2 = comment2
>                      3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.
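
The indentation rule described above can be sketched as follows. This is only an illustration in Python, not the code from the attached patches: the function name format_describe_row and the single-space column gap are assumptions made for the example, and the real CLI output may separate columns differently.

```python
def format_describe_row(name: str, col_type: str, comment: str, gap: str = " ") -> str:
    """Render one "describe" row so every comment line starts at the same column.

    Hypothetical helper: the name, signature, and gap are illustrative only.
    """
    # Everything printed before the comment on the first line.
    prefix = f"{name}{gap}{col_type}{gap}"
    # Continuation lines are padded to the width of that prefix.
    indent = " " * len(prefix)
    first, *rest = comment.split("\n")
    return "\n".join([prefix + first] + [indent + line for line in rest])


if __name__ == "__main__":
    print(format_describe_row(
        "birthdayParam", "string",
        "1 = comment1\n2 = comment2\n3 = comment3"))
```

If the comment starts at column k on the first line, each later line is padded with k spaces, which is the behaviour the report asks for.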



[jira] [Updated] (HIVE-3728) make optimizing multi-group by configurable

2012-12-06 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3728:
-

Attachment: hive.3728.3.patch

> make optimizing multi-group by configurable
> ---
>
> Key: HIVE-3728
> URL: https://issues.apache.org/jira/browse/HIVE-3728
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3728.2.patch, hive.3728.3.patch
>
>
> This was done as part of https://issues.apache.org/jira/browse/HIVE-609.
> This should be configurable.



[jira] [Assigned] (HIVE-3777) add a property in the partition to figure out if stats are accurate

2012-12-06 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu reassigned HIVE-3777:
--

Assignee: Gang Tim Liu

> add a property in the partition to figure out if stats are accurate
> ---
>
> Key: HIVE-3777
> URL: https://issues.apache.org/jira/browse/HIVE-3777
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Gang Tim Liu
>
> Currently, the stats task tries to update the statistics of the table/partition
> being updated after the table/partition is loaded. If updating these stats 
> fails (for any reason), the operation either succeeds
> (writing inaccurate stats) or fails, depending on whether hive.stats.reliable
> is set to true. This can be bad for applications that do not always care about
> reliable stats, since the query may have taken a long time to execute and then
> fail eventually.
> Another property should be added to the partition: areStatsAccurate. If 
> hive.stats.reliable is
> set to false, and stats could not be computed correctly, the operation would
> still succeed, update the stats, but set areStatsAccurate to false.
> If the application cares about accurate stats, they can be obtained in the 
> background.
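
A rough sketch of the proposed behaviour, written in Python for brevity. It is not Hive code: the function update_partition_stats, its parameters, and the use of a plain dict for partition parameters are all assumptions made for illustration. Only the names hive.stats.reliable and areStatsAccurate come from the issue text.

```python
def update_partition_stats(params: dict, stats: dict, complete: bool,
                           stats_reliable: bool) -> dict:
    """Return new partition parameters after a stats update attempt.

    params         -- existing partition parameters (hypothetical representation)
    stats          -- whatever statistics could be gathered
    complete       -- False if the stats could not be computed correctly
    stats_reliable -- stands in for the hive.stats.reliable setting
    """
    if not complete and stats_reliable:
        # With hive.stats.reliable=true the whole operation fails.
        raise RuntimeError("stats could not be computed reliably")
    updated = dict(params)
    updated.update(stats)
    # Proposed flag: record whether the stored stats can be trusted.
    updated["areStatsAccurate"] = "true" if complete else "false"
    return updated


if __name__ == "__main__":
    print(update_partition_stats({}, {"numRows": "10"}, complete=False,
                                 stats_reliable=False))
```

An application that needs accurate numbers could then look for partitions whose areStatsAccurate is false and recompute their stats in the background.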



[jira] [Updated] (HIVE-2935) Implement HiveServer2

2012-12-06 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-2935:
--

Attachment: HS2-with-thrift-patch-rebased.patch
HS2-changed-files-only.patch

> Implement HiveServer2
> -
>
> Key: HIVE-2935
> URL: https://issues.apache.org/jira/browse/HIVE-2935
> Project: Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Carl Steinbach
>Assignee: Carl Steinbach
>  Labels: HiveServer2
> Attachments: beelinepositive.tar.gz, HIVE-2935.1.notest.patch.txt, 
> HIVE-2935.2.notest.patch.txt, HIVE-2935.2.nothrift.patch.txt, 
> HS2-changed-files-only.patch, HS2-with-thrift-patch-rebased.patch
>
>




[jira] [Commented] (HIVE-2477) Use name of original expression for name of CAST output

2012-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526185#comment-13526185
 ] 

Hudson commented on HIVE-2477:
--

Integrated in Hive-trunk-h0.21 #1839 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1839/])
HIVE-2477 Use name of original expression for name of CAST output
(Navis via namit) (Revision 1418012)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1418012
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
* /hive/trunk/ql/src/test/queries/clientpositive/alias_casted_column.q
* /hive/trunk/ql/src/test/results/clientpositive/alias_casted_column.q.out


> Use name of original expression for name of CAST output
> ---
>
> Key: HIVE-2477
> URL: https://issues.apache.org/jira/browse/HIVE-2477
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adam Kramer
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-2477.1.patch.txt, hive.2477.4.patch, 
> HIVE-2477.D7161.1.patch, HIVE-2477.D7161.2.patch
>
>
> CAST(foo AS INT)
> should, by default, consider itself a column named foo if 
> unspecified/unaliased.



Hive-trunk-h0.21 - Build # 1839 - Still Failing

2012-12-06 Thread Apache Jenkins Server
Changes for Build #1831
[namit] HIVE-3747 Provide hive operation name for hookContext
(Shreepadma Venugopalan via namit)


Changes for Build #1832

Changes for Build #1833
[namit] HIVE-3750 JDBCStatsPublisher fails when ID length exceeds length of ID 
column
(Kevin Wilfong via namit)


Changes for Build #1835
[namit] HIVE-3073 Hive List Bucketing - DML support
(Gang Tim Liu via namit)

[namit] HIVE-3771 HIVE-3750 broke TestParse
(Kevin Wilfong via namit)

[hashutosh] HIVE-3384 : HIVE JDBC module won't compile under JDK1.7 as new 
methods added in JDBC specification (Shengsheng Huang, Chris Drome, Mikhail 
Bautin via Ashutosh Chauhan)

[namit] HIVE-3702 Renaming table changes table location scheme/authority
(Kevin Wilfong via namit)


Changes for Build #1836
[namit] HIVE-2723 should throw "Ambiguous column reference key" Exception in 
particular
join condition (Navis via namit)

[namit] HIVE-3594 When Group by Partition Column Type is Timestamp or STRING 
Which Format contains "HH:MM:SS",
It will occur URISyntaxException (Navis via namit)


Changes for Build #1837
[namit] HIVE-3762 Minor fix for 'tableName' in Hive.g
(Navis via namit)


Changes for Build #1838

Changes for Build #1839
[namit] HIVE-2477 Use name of original expression for name of CAST output
(Navis via namit)




1 tests failed.
FAILED:  org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get more logs.
at junit.framework.Assert.fail(Assert.java:47)
at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats19(TestCliDriver.java:41260)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1839)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1839/ to 
view the results.

[jira] [Updated] (HIVE-3401) Diversify grammar for split sampling

2012-12-06 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3401:
--

Attachment: HIVE-3401.D4821.6.patch

navis updated the revision "HIVE-3401 [jira] Diversify grammar for split 
sampling".
Reviewers: JIRA

  Addressed comments


REVISION DETAIL
  https://reviews.facebook.net/D4821

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SplitSample.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java
  ql/src/test/queries/clientpositive/split_sample.q
  ql/src/test/results/clientpositive/input4.q.out
  ql/src/test/results/clientpositive/nonmr_fetch.q.out
  ql/src/test/results/clientpositive/plan_json.q.out
  ql/src/test/results/clientpositive/split_sample.q.out

To: JIRA, navis
Cc: njain


> Diversify grammar for split sampling
> 
>
> Key: HIVE-3401
> URL: https://issues.apache.org/jira/browse/HIVE-3401
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3401.D4821.2.patch, HIVE-3401.D4821.3.patch, 
> HIVE-3401.D4821.4.patch, HIVE-3401.D4821.5.patch, HIVE-3401.D4821.6.patch
>
>
> Current split sampling only supports grammar like TABLESAMPLE(n PERCENT). But 
> some users want to specify just the size of the input. It can be easily 
> calculated with a few commands, but it seems good to support more grammar, 
> something like TABLESAMPLE(500M). 
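
To make the idea concrete, here is a small Python sketch of turning a size literal such as 500M into a byte count, plus the manual percent calculation that the description says users currently have to do. It is not taken from the patch: the accepted suffixes (K, M, G), their interpretation as powers of 1024, and both function names are assumptions for illustration.

```python
import re

# Assumed unit multipliers; the real grammar may accept different suffixes.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_sample_size(text: str) -> int:
    """Convert a literal like '500M' into a number of bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a size literal: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def equivalent_percent(sample_bytes: int, total_input_bytes: int) -> float:
    """Percent a user would otherwise compute by hand for TABLESAMPLE(n PERCENT)."""
    if total_input_bytes <= 0:
        raise ValueError("total input size must be positive")
    return min(100.0, 100.0 * sample_bytes / total_input_bytes)


if __name__ == "__main__":
    size = parse_sample_size("500M")
    print(size, equivalent_percent(size, 10 * 1024 ** 3))
```

Accepting the size directly saves the user from looking up the total input size and doing this division before every query.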



[jira] [Commented] (HIVE-3140) Comment indenting is broken for "describe" in CLI

2012-12-06 Thread Zhenxiao Luo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526199#comment-13526199
 ] 

Zhenxiao Luo commented on HIVE-3140:


Sure. Sorry, I did not run the whole test suite.


> Comment indenting is broken for "describe" in CLI
> -
>
> Key: HIVE-3140
> URL: https://issues.apache.org/jira/browse/HIVE-3140
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Xiaoxiao Hou
>Assignee: Zhenxiao Luo
>  Labels: patch
> Fix For: 0.10.0
>
> Attachments: HIVE-3140.1.patch.txt, HIVE-3140.2.patch.txt, 
> hive.3140.3.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Just go into the CLI and type "describe [TABLE_NAME]". If a comment has 
> multiple lines, it is completely unreadable due to poor comment indenting. 
> For example:
> birthdayParam string 1 = comment1
> 2 = comment2
> 3 = comment3
> But it is supposed to display as:
> birthdayParam string 1 = comment1
>                      2 = comment2
>                      3 = comment3
> Comments should be indented the same amount on each line, i.e., if the 
> comment starts at row k for the first line of the comment, it should be 
> indented by k on line 2.
