Re: Benefit of ORC format storing Sum, Min, Max...

2015-05-29 Thread Gopal Vijayaraghavan
> I am new to Hive, please help me understand the benefit of ORC file format storing Sum, Min, Max values. > Whenever we try to find a sum of values in a particular column, it still runs the MapReduce job. ORC uses row-indexes to constrain filtering. What you're looking at is the ORC file foo

Re: Benefit of ORC format storing Sum, Min, Max...

2015-05-29 Thread Jagat Singh
Did you compute table column stats? On 30 May 2015 9:04 am, "sreejesh s" wrote: > Hi, > > I am new to Hive, please help me understand the benefit of ORC file format > storing Sum, Min, Max values. > Whenever we try to find a sum of values in a particular column, it still > runs the MapReduce job. > > s

Benefit of ORC format storing Sum, Min, Max...

2015-05-29 Thread sreejesh s
Hi, I am new to Hive, please help me understand the benefit of ORC file format storing Sum, Min, Max values. Whenever we try to find a sum of values in a particular column, it still runs the MapReduce job. select sum(col1) from orctable; select sum(col1) from txttable; For a sample file with around

Re: only timestamp column value of previous row gets reset

2015-05-29 Thread Ujjwal
Hi all, The issue can be reproduced in a simple Java program (code attached for reference/use) where I do not use the iterator right away after reading, but store it in a vector for later use. As per my understanding, the iterator should not change once given to the consumer. However, the timestamp

Re: Hive over JDBC: Retrieving job Id and status

2015-05-29 Thread Hari Subramaniyan
Hi Kiran, For Async calls, see https://github.com/apache/hive/blob/master/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java#L83 The client in this case uses MiniHS2.getClient() which uses JDBC internally. The method suggested involves log

Re: cast column float

2015-05-29 Thread patcharee
Hi, There really is some data, but these two queries return nothing... so weird. select count(*) from u where xlong_u = 7.1578474 and xlat_u = 55.192524; select count(*) from u where xlong_u = cast(7.1578474 as float) and xlat_u = cast(55.192524 as float); Here is sample data. select x

RE: Hive over JDBC: Retrieving job Id and status

2015-05-29 Thread Lonikar, Kiran
Hi Hari, I am using Hive 0.13 and above. Thanks for the info. The example you provided uses the CLI and does not seem to use JDBC. The JDBC calls are blocking (no API like executeStatementAsync). Does the class org.apache.hive.service.cli.CLIServiceClient use JDBC internally? Does it involve l

Re: Hive over JDBC: Retrieving job Id and status

2015-05-29 Thread Hari Subramaniyan
Hi Kiran, Which version of Hive are you using? In the 1.2 release, we have an option to set session level logging from the client via hive.server2.logging.operation.level. Setting this parameter to EXECUTION level should provide map-red job information associated with the query at the client side,