>>> This email may contain confidential and privileged material for the sole
>>> use of the intended recipient. Any review, use, distribution or disclosure
>>> by others is strictly prohibited. If you are not the intended recipient (or
>>> authorized to receive for the recipient), please contact the sender by
>>> reply email and delete all copies of this message.
>>>
>>> Please click here
>>> <http://www.cisco.com/web/about/doing_business/legal/cri/index.html> for
>>> Company Registration Information.
>>>
>>>
>>>
>>>
>>> From: Alan Gates
>>> Reply-To: "user@hive.apache.org"
>>> Date: Monday, April 27, 2015 at 2:05 PM
>>> To: "user@hive.apache.org"
>>> Subject: Re: ORC file across multiple HDFS blocks
>>>
>>> to cross blocks and hence n
No, you don't want to be designing ORC files to not cross block
boundaries. Engines in Hadoop (MapReduce, Tez, etc.) are all built to
handle the fact that files tend to cross blocks and hence nodes. There
is value in lining up stripe size and HDFS block size so that your
stripes don't straddl
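
To make the stripe/block alignment point concrete, here is a rough sketch (not from the thread) that counts which stripes would cross an HDFS block boundary. It assumes fixed-size stripes laid out back to back from offset 0 and ignores the ORC header, footer, and any block padding, so real files will differ. The function name `straddling_stripes` and the sizes used are purely illustrative.

```python
def straddling_stripes(stripe_size, block_size, num_stripes):
    """Return indices of stripes whose byte range crosses an HDFS block boundary.

    Simplifying assumption: every stripe is exactly stripe_size bytes and
    stripes are packed contiguously starting at byte 0.
    """
    crossing = []
    for i in range(num_stripes):
        start = i * stripe_size
        end = start + stripe_size - 1  # offset of the last byte in this stripe
        # The stripe straddles a boundary if its first and last byte
        # fall into different blocks.
        if start // block_size != end // block_size:
            crossing.append(i)
    return crossing


MB = 1024 * 1024

# Stripe size divides the block size evenly: no stripe straddles a block.
aligned = straddling_stripes(64 * MB, 128 * MB, 8)

# Stripe size does not divide the block size: some stripes span two blocks.
misaligned = straddling_stripes(96 * MB, 128 * MB, 8)

print("aligned:", aligned)
print("misaligned:", misaligned)
```

With these illustrative numbers, a reader of a straddling stripe would need data from two blocks, which may live on different nodes; when the stripe size divides the block size evenly, each stripe can be read from a single block.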
Hi guys,
I am working on reading ORC files directly from an HDFS cluster, and I hope to
leverage HDFS short-circuit local reads (
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html)
as much as possible.
According to ORC design, each ORC file usually contain