From: Alan Gates [mailto:alanfga...@gmail.com]
Sent: 06 January 2016 18:19
To: user@hive.apache.org
Subject: Re: Indexes in Hive
The issue wi
I am not sure how much performance one could gain in comparison to ORC or
Parquet. They work pretty well once you know how to use them. However,
there are still ways to optimize them. For instance, sorting of data is a
key factor for these formats to be efficient. Nevertheless, if you have a
lot of
The issue with this is that HDFS lacks the ability to co-locate blocks.
So if you break your columns into one file per column (the more
traditional column route) you end up in a situation where 2/3 of the
time only one of your columns is being locally read, which results in a
significant perfo
ache.org
Subject: Re: Indexes in Hive
If I understand you correctly this could be just another Hive storage format.
> On 06 Jan 2016, at 07:24, Mich Talebzadeh wrote:
>
> Hi,
>
> Thinking loudly.
>
> Ideally we should consider a totally columnar storage offering in which each
> column of a table is stored as a compressed value (I disr