Re: Faster Spark on ORC with Apache ORC

2017-07-13 Thread Jeff Zhang
Awesome, Dong Joon. It's a great improvement. Looking forward to its merge. Dong Joon Hyun wrote on Wed, Jul 12, 2017 at 6:53 AM: > Hi, All. > > Since Apache Spark 2.2 vote passed successfully last week, > I think it’s a good time for me to ask your opinions again about the following PR. > > https:

Re: Faster Spark on ORC with Apache ORC

2017-07-11 Thread Dong Joon Hyun
Hi, All. Since Apache Spark 2.2 vote passed successfully last week, I think it’s a good time for me to ask your opinions again about the following PR. https://github.com/apache/spark/pull/17980 (+3,887, −86) It’s for the following issues. * SPARK-20728: Make ORCFileFormat configurable be

Re: Faster Spark on ORC with Apache ORC

2017-05-14 Thread Dong Joon Hyun
AM To: dev@spark.apache.org Subject: Re: Faster Spark on ORC with Apache ORC Hi, I have been wondering how much more Apache Spark 2.2.0 will improve. This is the prior record from the source code. Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz SQL Single Int Column Scan:

Re: Faster Spark on ORC with Apache ORC

2017-05-12 Thread Dong Joon Hyun
Hi, I have been wondering how much more Apache Spark 2.2.0 will improve. This is the prior record from the source code. Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz SQL Single Int Column Scan: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative -