will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread James Yu
I didn't see anyone ask this question before, but I was wondering if anyone
knows whether Spark/SparkSQL will support the ORCFile format soon. ORCFile is
getting more and more popular in the Hive world.

Thanks,
James


Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-08 Thread James Yu
Thanks Mark! I will keep an eye on it.

@Evan, I have seen people use both formats, so I really want Spark to support
ORCFile.


On Wed, Oct 8, 2014 at 11:12 AM, Mark Hamstra 
wrote:

> https://github.com/apache/spark/pull/2576
>
>
>
> On Wed, Oct 8, 2014 at 11:01 AM, Evan Chan 
> wrote:
>
>> James,
>>
>> Michael at the meetup last night said there was some development
>> activity around ORCFiles.
>>
>> I'm curious though, what are the pros and cons of ORCFiles vs Parquet?
>>
>> On Wed, Oct 8, 2014 at 10:03 AM, James Yu  wrote:
>> > Didn't see anyone asked the question before, but I was wondering if anyone
>> > knows if Spark/SparkSQL will support ORCFile format soon? ORCFile is
>> > getting more and more popular in the Hive world.
>> >
>> > Thanks,
>> > James
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
>


Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-09 Thread James Yu
Performance-wise, will foreign data sources be on par with native ones?

Thanks,
James


On Wed, Oct 8, 2014 at 11:03 PM, Cheng Lian  wrote:

> The foreign data source API PR also matters here:
> https://www.github.com/apache/spark/pull/2475
>
> Foreign data sources like ORC can be added more easily and systematically
> after this PR is merged.
>
> On 10/9/14 8:22 AM, James Yu wrote:
>
>> Thanks Mark! I will keep an eye on it.
>>
>> @Evan, I have seen people use both formats, so I really want Spark to
>> support ORCFile.


Re: will/when Spark/SparkSQL will support ORCFile format

2014-10-09 Thread James Yu
Sounds great, thanks!



On Thu, Oct 9, 2014 at 2:22 PM, Michael Armbrust 
wrote:

> Yes, the foreign sources work is only about exposing a stable set of APIs
> for external libraries to link against (to avoid the Spark assembly
> becoming a dependency mess). The code path these APIs use will be the same
> as that for data sources included in the core Spark SQL library.
>
> Michael
>
> On Thu, Oct 9, 2014 at 2:18 PM, James Yu  wrote:
>
>> Performance-wise, will foreign data sources be on par with native ones?
>>
>> Thanks,
>> James
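
Michael's point, that sources living in external libraries go through the
same code path as sources bundled with the core library, can be illustrated
with a minimal sketch. It is written in Python rather than Spark's Scala, and
every name in it (DataSource, register_source, read, ToyParquetSource,
ToyOrcSource) is hypothetical and is not part of any Spark API. It only shows
the general shape of a stable interface with one shared read path.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]


class DataSource(ABC):
    """Hypothetical stable interface; NOT Spark's actual API."""

    @abstractmethod
    def scan(self, path: str) -> Iterator[Row]:
        """Yield the rows stored at `path`."""


# One registry shared by built-in and external sources.
SOURCES: Dict[str, DataSource] = {}


def register_source(fmt: str, source: DataSource) -> None:
    """Make a source available under a format name."""
    SOURCES[fmt] = source


class ToyParquetSource(DataSource):
    """Stands in for a source bundled with the core library."""

    def scan(self, path: str) -> Iterator[Row]:
        yield {"format": "parquet", "path": path, "value": 1}
        yield {"format": "parquet", "path": path, "value": 2}


class ToyOrcSource(DataSource):
    """Stands in for a source shipped as an external library."""

    def scan(self, path: str) -> Iterator[Row]:
        yield {"format": "orc", "path": path, "value": 1}
        yield {"format": "orc", "path": path, "value": 2}


def read(fmt: str, path: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """The single code path: look up the source, scan it, filter the rows."""
    source = SOURCES[fmt]
    return [row for row in source.scan(path) if predicate(row)]


register_source("parquet", ToyParquetSource())  # plays the "built-in" role
register_source("orc", ToyOrcSource())          # plays the "external" role

parquet_rows = read("parquet", "/data/a", lambda r: r["value"] > 1)
orc_rows = read("orc", "/data/b", lambda r: r["value"] > 1)
```

Because read() only depends on the DataSource interface, the external toy
source is scanned and filtered exactly like the bundled one; only where the
class is packaged differs.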