Hi All,
Can custom storage handlers get information for queries such as count, max,
min, etc. from Hive directly, so that the RecordReader need not fetch all
the records for each such query?
Regards,
Amey
On Tue, Mar 22, 2016 at 1:46 PM, Amey Barve wrote:
> Thanks Nitin, Mich,
>
> if its just plai
Thanks Nitin, Mich,
if it's just a plain vanilla text file format, it needs to run a job to get
the count, so that takes the longest of all
--> So Hive must be translating some operator, like fetch (for count), into a
MapReduce job and getting the result from it?
Can a custom storage handler get information about the operat
An ORC file keeps statistics for its storage indexes at the following levels:
1. The ORC file itself
2. Multiple stripes (chunks) within the ORC file
3. Multiple row groups (row batches) within each stripe
Assuming that the underlying table has up-to-date stats, the count will be
stored for each column.
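
To illustrate how the three levels relate, here is a minimal Python sketch. It is not the ORC reader API: the `ColumnStats` class, the `merge` helper, and all the numbers are made up for illustration. The point is only that count, min and max for a column can be rolled up from row groups to stripes to the file without touching any rows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnStats:
    """Hypothetical per-column summary kept at one statistics level."""
    count: int     # number of values covered by this summary
    minimum: int
    maximum: int


def merge(stats: List[ColumnStats]) -> ColumnStats:
    """Roll lower-level summaries up into one higher-level summary."""
    return ColumnStats(
        count=sum(s.count for s in stats),
        minimum=min(s.minimum for s in stats),
        maximum=max(s.maximum for s in stats),
    )


# Made-up row-group statistics for one column, grouped by stripe.
stripe_1_row_groups = [ColumnStats(10000, 1, 500), ColumnStats(10000, 450, 990)]
stripe_2_row_groups = [ColumnStats(10000, 3, 720), ColumnStats(2500, 100, 1200)]

# Level 3 -> level 2: one summary per stripe.
stripe_stats = [merge(stripe_1_row_groups), merge(stripe_2_row_groups)]

# Level 2 -> level 1: one summary for the whole file.
file_stats = merge(stripe_stats)

print(file_stats.count, file_stats.minimum, file_stats.maximum)
```

With summaries like these, a count, min or max over the whole column only needs the file-level entry, and the finer levels help when a filter narrows the query to some stripes or row groups.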
So when we
If you have enabled performance optimization by enabling statistics, the
count will come from there.
If the underlying file format supports in-file statistics (like ORC), it
will come from there.
If it's just a plain vanilla text file format, it needs to run a job to get
the count, so that takes the longest of all.
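
The order of preference described above can be sketched roughly as follows. This is plain Python, not Hive code: `row_count` and its three parameters are hypothetical names standing in for the table statistics, the in-file statistics, and the job that reads every record.

```python
from typing import Callable, Iterable, Optional, Tuple


def row_count(
    table_stats_count: Optional[int],
    in_file_counts: Optional[Iterable[int]],
    scan_rows: Callable[[], Iterable[object]],
) -> Tuple[int, str]:
    """Return (count, source), trying the cheapest source first."""
    # 1. Statistics are enabled and available: no data is read at all.
    if table_stats_count is not None:
        return table_stats_count, "table statistics"
    # 2. The file format carries its own statistics (as ORC does):
    #    add up the per-file counts instead of reading rows.
    if in_file_counts is not None:
        return sum(in_file_counts), "in-file statistics"
    # 3. Plain text files: every record has to be read and counted,
    #    which is the slowest path.
    return sum(1 for _ in scan_rows()), "full scan"


print(row_count(42, None, lambda: []))
print(row_count(None, [10, 5], lambda: []))
print(row_count(None, None, lambda: iter(["a", "b", "c"])))
```

Only the last branch calls `scan_rows`, which is the situation the original question wants to avoid for a custom storage handler.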
On Tue,