Re: Use Apache ORC in Apache Spark 2.3

2017-08-11 Thread Sean Owen
-private@ list for future replies. This is not a PMC conversation. On Fri, Aug 11, 2017 at 3:17 AM Andrew Ash wrote: > @Reynold no I don't use the HiveCatalog -- I'm using a custom > implementation of ExternalCatalog instead. > > On Thu, Aug 10, 2017 at 3:34 PM, Dong Joon Hyun > wrote: > >> Tha

Re: Use Apache ORC in Apache Spark 2.3

2017-08-10 Thread Andrew Ash
Thursday, August 10, 2017 at 3:23 PM > To: Andrew Ash > Cc: Dong Joon Hyun, "dev@spark.apache.org", Apache Spark PMC > Subject: Re: Use Apache ORC in Apache Spark 2.3 > > Do you not use the catalog? > > On T

Re: Use Apache ORC in Apache Spark 2.3

2017-08-10 Thread Dong Joon Hyun
, August 10, 2017 at 3:23 PM To: Andrew Ash Cc: Dong Joon Hyun, "dev@spark.apache.org", Apache Spark PMC Subject: Re: Use Apache ORC in Apache Spark 2.3 Do you not use the catalog? On Thu, Aug 10, 2017 at 3:22 PM, Andrew Ash wrote: I would support mov

Re: Use Apache ORC in Apache Spark 2.3

2017-08-10 Thread Reynold Xin
here is no further comments except the last comment(5) from >> Owen in this week. >> >> >> >> Please give your opinion if you think we need some change on the current >> PR (as-is). >> >> FYI, there is one LGTM on the PR (as-is) and no -1 so far. >

Re: Use Apache ORC in Apache Spark 2.3

2017-08-10 Thread Andrew Ash
nt(5) from > Owen in this week. > > > > Please give your opinion if you think we need some change on the current > PR (as-is). > > FYI, there is one LGTM on the PR (as-is) and no -1 so far. > > > > Thank you again for supporting new ORC improvement in A

Re: Use Apache ORC in Apache Spark 2.3

2017-08-10 Thread Dong Joon Hyun
porting new ORC improvement in Apache Spark. Bests, Dongjoon. From: Dong Joon Hyun Date: Friday, August 4, 2017 at 8:05 AM To: "dev@spark.apache.org" Cc: Apache Spark PMC Subject: Use Apache ORC in Apache Spark 2.3 Hi, All. Apache Spark has always been a fast and general engine,

Re: Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Dong Joon Hyun
Thank you so much, Owen! Bests, Dongjoon. From: Owen O'Malley Date: Friday, August 4, 2017 at 9:59 AM To: Dong Joon Hyun Cc: "dev@spark.apache.org" , Apache Spark PMC Subject: Re: Use Apache ORC in Apache Spark 2.3 The ORC community is really eager to get this work integra

Re: Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Owen O'Malley
The ORC community is really eager to get this work integrated into Spark so that Spark users can have fast access to their ORC data. Let us know if we can help with the integration. Thanks, Owen On Fri, Aug 4, 2017 at 8:05 AM, Dong Joon Hyun wrote: > Hi, All. > > > > Apache Spark has always been

Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Dong Joon Hyun
Hi, All. Apache Spark has always been a fast and general engine, and has supported Apache ORC inside the `sql/hive` module with a Hive dependency since Spark 1.4.X (SPARK-2883). However, there are many open issues about `Feature parity for ORC with Parquet (SPARK-20901)` as of today. With new Apache ORC 1