[jira] [Created] (HIVE-15316) CTAS STORED AS AVRO: AvroTypeException Found default.record_0, expecting union

2016-11-30 Thread David Maughan (JIRA)
David Maughan created HIVE-15316:


 Summary: CTAS STORED AS AVRO: AvroTypeException Found 
default.record_0, expecting union
 Key: HIVE-15316
 URL: https://issues.apache.org/jira/browse/HIVE-15316
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
Reporter: David Maughan
Priority: Minor


There is an issue when querying a table that has been created as Avro via CTAS 
and whose target struct is nested at least two levels deep. It can be replicated 
with the following steps:

{code}
CREATE TABLE a
  STORED AS AVRO
  AS
SELECT named_struct('c', named_struct('d', 1)) as b;

SELECT b FROM a;

org.apache.avro.AvroTypeException: Found default.record_0, expecting union
{code}

The reason is that during table creation the Avro schema is generated from the 
Hive columns in {{AvroSerDe}} and then passed through the Avro Schema Parser: 
{{new Schema.Parser().parse(schema.toString())}}. For the example above, this 
writes the schema below into the Avro file. Note that the lowest-level struct, 
{{record_0}}, has {{"namespace": "default"}}.

{code}
{
  "type": "record",
  "name": "a",
  "namespace": "default",
  "fields": [
    {
      "name": "b",
      "type": [
        "null",
        {
          "type": "record",
          "name": "record_1",
          "namespace": "",
          "doc": "struct<c:struct<d:int>>",
          "fields": [
            {
              "name": "c",
              "type": [
                "null",
                {
                  "type": "record",
                  "name": "record_0",
                  "namespace": "default",
                  "doc": "struct<d:int>",
                  "fields": [
                    {
                      "name": "d",
                      "type": [ "null", "int" ],
                      "doc": "int",
                      "default": null
                    }
                  ]
                }
              ],
              "doc": "struct<d:int>",
              "default": null
            }
          ]
        }
      ],
      "default": null
    }
  ]
}
{code}

On a subsequent select query, the Avro schema is again generated from the Hive 
columns. However, this time it is not passed through the Avro Schema Parser, and 
the {{namespace}} attribute is not present in {{record_0}}. The error message 
_"Found default.record_0, expecting union"_ is slightly misleading: although a 
union is expected, it specifically expects either a null or a record named 
{{record_0}}, but instead finds {{default.record_0}}.

I believe this is a bug in Avro. I'm not sure whether the correct behaviour is to 
cascade the namespace down or not, but it is definitely an inconsistency between 
creating a schema via the builders and via the parser. I've created 
[AVRO-1965|https://issues.apache.org/jira/browse/AVRO-1965] for this. However, 
I believe that defensively passing the schema through the Avro Schema Parser on 
a select query would fix this issue in Hive without requiring an Avro fix and a 
version bump in Hive.
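
A minimal sketch of that defensive normalisation (the wrapper class and the exact 
call site inside {{AvroSerDe}} are assumptions for illustration; 
{{new Schema.Parser().parse(...)}} is the same call already used at 
table-creation time):

{code}
import org.apache.avro.Schema;

public final class AvroSchemaNormalizer {
  /**
   * Round-trips a generated Avro schema through the Avro Schema Parser so the
   * reader-side schema matches the parser-processed schema written at CTAS time.
   */
  public static Schema normalize(Schema generated) {
    return new Schema.Parser().parse(generated.toString());
  }
}
{code}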



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15318) Query "insert into table values()" creates the tmp table under the current database

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15318:
---

 Summary: Query "insert into table values()" creates the tmp table 
under the current database
 Key: HIVE-15318
 URL: https://issues.apache.org/jira/browse/HIVE-15318
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The current implementation of "insert into db1.table1 values()" creates a tmp 
table under the current database, even though table1 may not be in the current 
database.

e.g.,

{noformat}
use default;
create database db1;
create table db1.table1(x int);
insert into db1.table1 values(3);
{noformat}

This creates the tmp table under the default database. If authorization is 
turned on and the current user only has access to db1 but not to the default 
database, this causes an access failure.

We may need to rethink the approach for the implementation. 
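
If the tmp table really does follow the session's current database as described 
above, a possible interim workaround (untested here, stated as an assumption) is 
to switch databases before inserting so the tmp table lands in a database the 
user can access:

{noformat}
use db1;
insert into table1 values(3);
{noformat}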





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15317) Query "insert into table values()" creates the tmp table under the current database

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15317:
---

 Summary: Query "insert into table values()" creates the tmp table 
under the current database
 Key: HIVE-15317
 URL: https://issues.apache.org/jira/browse/HIVE-15317
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 2.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


The current implementation of "insert into db1.table1 values()" creates a tmp 
table under the current database, even though table1 may not be in the current 
database.

e.g.,

{noformat}
use default;
create database db1;
create table db1.table1(x int);
insert into db1.table1 values(3);
{noformat}

This creates the tmp table under the default database. If authorization is 
turned on and the current user only has access to db1 but not to the default 
database, this causes an access failure.

We may need to rethink the approach for the implementation. 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15319) Beeline is not validating Kerberos Realm

2016-11-30 Thread Matyas Orhidi (JIRA)
Matyas Orhidi created HIVE-15319:


 Summary: Beeline is not validating Kerberos Realm
 Key: HIVE-15319
 URL: https://issues.apache.org/jira/browse/HIVE-15319
 Project: Hive
  Issue Type: Bug
Reporter: Matyas Orhidi


Having "hive.server2.authentication.kerberos.principal" property set as 
"hive/somehost@SOME.REALM" [1] in HS2 
- When connecting to the service using beeline, seemingly the realm part of the 
service principal in the JDBC URL is not validated 
- You can connect to HS2 using any realm e.g. 
principal=hive/somehost@ANYOTHER.REALM [2] 

*** 
[1]  
hive.server2.authentication.kerberos.principal 
hive/somehost@SOME.REALM 
 

[2] 'jdbc:hive2://somehost:1/default;principal=hive/somehost@ANYOTHER.REALM'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15320) Cross Realm hive query is failing with KERBEROS authentication error

2016-11-30 Thread Yongzhi Chen (JIRA)
Yongzhi Chen created HIVE-15320:
---

 Summary: Cross Realm hive query is failing with KERBEROS 
authentication error
 Key: HIVE-15320
 URL: https://issues.apache.org/jira/browse/HIVE-15320
 Project: Hive
  Issue Type: Improvement
  Components: Security
Reporter: Yongzhi Chen


Executing a cross-realm query fails.
Authentication against the remote NN is attempted with SIMPLE, not KERBEROS.
It looks like Hive does not obtain the needed ticket for the remote NN.

insert overwrite directory 'hdfs://differentrealmhost:8020/hive/test' select *
from currentrealmtable where ...;

It fails with:
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client
cannot authenticate via:[TOKEN, KERBEROS]

The hdfs distcp command works fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 53983: HIVE-14582 : Add trunc(numeric) udf

2016-11-30 Thread chinnarao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53983/
---

(Updated Nov. 30, 2016, 7:04 p.m.)


Review request for hive and Ashutosh Chauhan.


Changes
---

Updated the diff so that the trunc() function now accepts non-constants as the 
scale argument. Updated the test cases as well. All tests pass in my local 
environment.


Repository: hive-git


Description
---

Overload trunc() function to accept numbers.

Now trunc() accepts date or number arguments and behaves as follows:

trunc(date, fmt) / trunc(N, D) - Returns

If the input is a date, it returns the date with the time portion of the day 
truncated to the unit specified by the format model fmt.
If you omit fmt, the date is truncated to the nearest day. Currently only 
'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' are supported as formats.

If the input is a number, it returns N truncated to D decimal places. If D is 
omitted, N is truncated to 0 places.
D can be negative to truncate (make zero) D digits left of the decimal point.
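
For illustration, the behaviour described above would look roughly like this 
(example values are my own reading of the semantics, not taken from the .q.out 
files):

    SELECT trunc(1234.567, 2);         -- 1234.56
    SELECT trunc(1234.567);            -- 1234
    SELECT trunc(1234.567, -2);        -- 1200
    SELECT trunc('2016-11-30', 'MM');  -- 2016-11-01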


Diffs (updated)
-

  data/files/trunc_number.txt PRE-CREATION 
  data/files/trunc_number1.txt PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java 
e20ad65 
  ql/src/test/queries/clientnegative/udf_trunc_error3.q PRE-CREATION 
  ql/src/test/queries/clientpositive/udf_trunc_number.q PRE-CREATION 
  ql/src/test/results/clientnegative/udf_trunc_error1.q.out 5d65b11 
  ql/src/test/results/clientnegative/udf_trunc_error2.q.out 55a2185 
  ql/src/test/results/clientnegative/udf_trunc_error3.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/udf_trunc.q.out 4c9f76d 
  ql/src/test/results/clientpositive/udf_trunc_number.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/53983/diff/


Testing
---

All tests pass.


Thanks,

chinna



[jira] [Created] (HIVE-15321) Change to read as long for HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE

2016-11-30 Thread Aihua Xu (JIRA)
Aihua Xu created HIVE-15321:
---

 Summary: Change to read as long for 
HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE
 Key: HIVE-15321
 URL: https://issues.apache.org/jira/browse/HIVE-15321
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 1.1.0, 1.2.0
Reporter: Aihua Xu
Assignee: Aihua Xu


Follow-up to HIVE-11240, which changes the type from int to long, while we 
still read the value with {{conf.getIntVar()}}.

It seems we should use {{conf.getLongVar()}} instead.
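
A minimal sketch of the intended change at the read site (the helper name and 
the exact location in the metastore server code are assumptions):

{code}
// Read the max message size as a long so values above Integer.MAX_VALUE,
// allowed by the long-typed default from HIVE-11240, are not truncated.
private static long getMaxMessageSize(HiveConf conf) {
  // was: conf.getIntVar(HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE)
  return conf.getLongVar(HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE);
}
{code}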



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] Apache Hive 2.1.1 Release Candidate 1

2016-11-30 Thread Sergio Pena
Jesus,

I tried verifying the md5 and gpg signatures, but I get these errors:

hive/packaging/target⟫ md5sum -c apache-hive-2.1.1-bin.tar.gz.md5
apache-hive-2.1.1-bin.tar.gz: FAILED
md5sum: WARNING: 1 computed checksum did NOT match

hive/packaging/target⟫ gpg --verify apache-hive-2.1.1-bin.tar.gz.asc
apache-hive-2.1.1-bin.tar.gz
gpg: Signature made Tue 29 Nov 2016 01:57:04 PM CST
gpg: using RSA key 931E4AB3C516B444
gpg: Can't check signature: No public key

I'm using Ubuntu, so I think the md5 differs between OSX and Linux machines. I
remember seeing this problem before. What OS did you use?

For the GPG keys, I imported the KEYS file mentioned in the wiki, but I
still get that error. Any idea what I'm missing?

On Tue, Nov 29, 2016 at 6:23 PM, Gary Gregory 
wrote:

> FWIW, running 'mvn clean install' has been failing on Git master for a long
> time on Windows. Will that ever be fixed?
>
> Gary
>
> On Tue, Nov 29, 2016 at 12:17 PM, Jesus Camacho Rodriguez <
> jcama...@apache.org> wrote:
>
> > Apache Hive 2.1.1 Release Candidate 1 is available here:
> > http://people.apache.org/~jcamacho/hive-2.1.1-rc1/
> >
> > Maven artifacts are available here:
> > https://repository.apache.org/content/repositories/orgapachehive-1066/
> >
> > Source tag for RC1 is at:
> > https://github.com/apache/hive/releases/tag/release-2.1.1-rc1/
> >
> > Voting will conclude in 72 hours.
> >
> > Hive PMC Members: Please test and vote.
> >
> > Thanks.
> >
> >
> >
> >
>
>
> --
> E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
> Java Persistence with Hibernate, Second Edition
> JUnit in Action, Second Edition
> Spring Batch in Action
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory
>


Review Request 54236: HIVE-15296 AM may lose task failures and not reschedule when scheduling to LLAP

2016-11-30 Thread Sergey Shelukhin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54236/
---

Review request for hive, Prasanth_J and Siddharth Seth.


Repository: hive-git


Description
---

see jira


Diffs
-

  
llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
 4933fb3 
  
llap-common/src/gen/protobuf/gen-java/org/apache/hadoop/hive/llap/daemon/rpc/LlapDaemonProtocolProtos.java
 0581681 
  llap-common/src/java/org/apache/hadoop/hive/llap/DaemonId.java ea47330 
  
llap-common/src/java/org/apache/hadoop/hive/llap/protocol/LlapTaskUmbilicalProtocol.java
 9549567 
  llap-common/src/protobuf/LlapDaemonProtocol.proto 2e74c18 
  llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/AMReporter.java 
04c28cb 
  
llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/ContainerRunnerImpl.java
 91a321d 
  llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/LlapDaemon.java 
752e6ee 
  
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskCommunicator.java
 0deebf9 

Diff: https://reviews.apache.org/r/54236/diff/


Testing
---


Thanks,

Sergey Shelukhin



Re: Review Request 53983: HIVE-14582 : Add trunc(numeric) udf

2016-11-30 Thread Vineet Garg

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53983/#review157471
---




ql/src/test/queries/clientnegative/udf_trunc_error3.q (line 1)


I think it'll be good to add tests with negative numbers as well as no-op 
cases (e.g. select trunc(12.34, 100)).


- Vineet Garg


On Nov. 30, 2016, 7:04 p.m., Chinna Rao Lalam wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53983/
> ---
> 
> (Updated Nov. 30, 2016, 7:04 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Overload trunc() function to accept numbers.
> 
> Now trunc() accepts date or number arguments and behaves as follows:
> 
> trunc(date, fmt) / trunc(N, D) - Returns
> 
> If the input is a date, it returns the date with the time portion of the day 
> truncated to the unit specified by the format model fmt.
> If you omit fmt, the date is truncated to the nearest day. Currently only 
> 'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' are supported as formats.
> 
> If the input is a number, it returns N truncated to D decimal places. If D is 
> omitted, N is truncated to 0 places.
> D can be negative to truncate (make zero) D digits left of the decimal point.
> 
> 
> Diffs
> -
> 
>   data/files/trunc_number.txt PRE-CREATION 
>   data/files/trunc_number1.txt PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java 
> e20ad65 
>   ql/src/test/queries/clientnegative/udf_trunc_error3.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/udf_trunc_number.q PRE-CREATION 
>   ql/src/test/results/clientnegative/udf_trunc_error1.q.out 5d65b11 
>   ql/src/test/results/clientnegative/udf_trunc_error2.q.out 55a2185 
>   ql/src/test/results/clientnegative/udf_trunc_error3.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/udf_trunc.q.out 4c9f76d 
>   ql/src/test/results/clientpositive/udf_trunc_number.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/53983/diff/
> 
> 
> Testing
> ---
> 
> All tests pass.
> 
> 
> Thanks,
> 
> Chinna Rao Lalam
> 
>



[jira] [Created] (HIVE-15322) Skipping "hbase mapredcp" in hive script for certain services

2016-11-30 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-15322:
-

 Summary: Skipping "hbase mapredcp" in hive script for certain 
services
 Key: HIVE-15322
 URL: https://issues.apache.org/jira/browse/HIVE-15322
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai


"hbase mapredcp" is intended to append hbase classpath to hive. However, the 
command can take some time when the system is heavy loaded. In some extreme 
cases, we saw ~20s delay due to it. For certain commands, such as "schemaTool", 
hbase classpath is certainly useless, and we can safely skip invoking it.
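
A rough sketch of the kind of guard intended in the hive launcher script 
(variable and service names here are illustrative, not the actual bin/hive code):

{code}
# Skip the potentially slow "hbase mapredcp" call for services that never need
# the HBase jars on the classpath.
if [ "$SERVICE" != "schemaTool" ]; then
  HBASE_MAPREDCP=$(hbase mapredcp 2>/dev/null)
  HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HBASE_MAPREDCP}"
fi
{code}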



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15323) allow the user to turn off reduce-side SMB join

2016-11-30 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-15323:
---

 Summary: allow the user to turn off reduce-side SMB join
 Key: HIVE-15323
 URL: https://issues.apache.org/jira/browse/HIVE-15323
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15324) Enable round() function to accept scale argument as non-constants

2016-11-30 Thread Chinna Rao Lalam (JIRA)
Chinna Rao Lalam created HIVE-15324:
---

 Summary: Enable round() function to accept scale argument as 
non-constants
 Key: HIVE-15324
 URL: https://issues.apache.org/jira/browse/HIVE-15324
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam


The round() function should accept the scale argument as a non-constant; this 
will enable queries like:
{quote}
create table sampletable(c double, d int);
select round(c,d) from sampletable;
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-15325) Add tests with negative numbers as well as no-op tests

2016-11-30 Thread Chinna Rao Lalam (JIRA)
Chinna Rao Lalam created HIVE-15325:
---

 Summary: Add tests with negative numbers as well as no-op tests
 Key: HIVE-15325
 URL: https://issues.apache.org/jira/browse/HIVE-15325
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Priority: Minor


Add tests with negative numbers as well as no-op cases (e.g. select 
trunc(12.34, 100)).
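
A rough sketch of the cases intended (expected values are my own reading of the 
truncation semantics from HIVE-14582, not verified output):

{noformat}
select trunc(-12.34, 1);    -- expected -12.3
select trunc(-12.34, -1);   -- expected -10
select trunc(12.34, 100);   -- no-op, expected 12.34
{noformat}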



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 53983: HIVE-14582 : Add trunc(numeric) udf

2016-11-30 Thread Chinna Rao Lalam


> On Dec. 1, 2016, 12:05 a.m., Vineet Garg wrote:
> > ql/src/test/queries/clientnegative/udf_trunc_error3.q, line 1
> > 
> >
> > I think it'll be good to add tests with negative numbers as well as 
> > no-op cases (e.g. select trunc(12.34, 100)).

Thanks for the review. I will add these tests as part of HIVE-15325.


- Chinna Rao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53983/#review157471
---


On Nov. 30, 2016, 7:04 p.m., Chinna Rao Lalam wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53983/
> ---
> 
> (Updated Nov. 30, 2016, 7:04 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Overload trunc() function to accept numbers.
> 
> Now trunc() accepts date or number arguments and behaves as follows:
> 
> trunc(date, fmt) / trunc(N, D) - Returns
> 
> If the input is a date, it returns the date with the time portion of the day 
> truncated to the unit specified by the format model fmt.
> If you omit fmt, the date is truncated to the nearest day. Currently only 
> 'MONTH'/'MON'/'MM' and 'YEAR'/'YYYY'/'YY' are supported as formats.
> 
> If the input is a number, it returns N truncated to D decimal places. If D is 
> omitted, N is truncated to 0 places.
> D can be negative to truncate (make zero) D digits left of the decimal point.
> 
> 
> Diffs
> -
> 
>   data/files/trunc_number.txt PRE-CREATION 
>   data/files/trunc_number1.txt PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java 
> e20ad65 
>   ql/src/test/queries/clientnegative/udf_trunc_error3.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/udf_trunc_number.q PRE-CREATION 
>   ql/src/test/results/clientnegative/udf_trunc_error1.q.out 5d65b11 
>   ql/src/test/results/clientnegative/udf_trunc_error2.q.out 55a2185 
>   ql/src/test/results/clientnegative/udf_trunc_error3.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/udf_trunc.q.out 4c9f76d 
>   ql/src/test/results/clientpositive/udf_trunc_number.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/53983/diff/
> 
> 
> Testing
> ---
> 
> All tests pass.
> 
> 
> Thanks,
> 
> Chinna Rao Lalam
> 
>