will/when Spark/SparkSQL will support ORCFile format
I haven't seen anyone ask this before, but does anyone know whether Spark/SparkSQL will support the ORCFile format soon? ORCFile is getting more and more popular in the Hive world. Thanks, James
Re: will/when Spark/SparkSQL will support ORCFile format
Thanks Mark! I will keep an eye on it.

@Evan, I have seen people use both formats, so I really want Spark to support ORCFile.

On Wed, Oct 8, 2014 at 11:12 AM, Mark Hamstra wrote:
> https://github.com/apache/spark/pull/2576
>
> On Wed, Oct 8, 2014 at 11:01 AM, Evan Chan wrote:
>> James,
>>
>> Michael at the meetup last night said there was some development
>> activity around ORCFiles.
>>
>> I'm curious though, what are the pros and cons of ORCFiles vs Parquet?
>>
>> On Wed, Oct 8, 2014 at 10:03 AM, James Yu wrote:
>> > Didn't see anyone asked the question before, but I was wondering if anyone
>> > knows if Spark/SparkSQL will support ORCFile format soon? ORCFile is
>> > getting more and more popular in the Hive world.
>> >
>> > Thanks,
>> > James
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
Re: will/when Spark/SparkSQL will support ORCFile format
Performance-wise, will foreign data formats be supported on par with native ones?

Thanks,
James

On Wed, Oct 8, 2014 at 11:03 PM, Cheng Lian wrote:
> The foreign data source API PR also matters here:
> https://www.github.com/apache/spark/pull/2475
>
> Foreign data sources like ORC can be added more easily and systematically
> after this PR is merged.
Re: will/when Spark/SparkSQL will support ORCFile format
Sounds great, thanks!

On Thu, Oct 9, 2014 at 2:22 PM, Michael Armbrust wrote:
> Yes, the foreign sources work is only about exposing a stable set of APIs
> for external libraries to link against (to avoid the spark assembly
> becoming a dependency mess). The code path these APIs use will be the same
> as that for datasources included in the core spark sql library.
>
> Michael