Where is the uniqueSource in StreamExecution changed?

2017-08-04 Thread ??????????
Hi all, These days I am studying the StreamExecution code. In the method constructNextBatch (around line 365), I found that the value of latestOffsets changes, but I cannot find where the s.getOffset of uniqueSource is changed. Here is the code link: https://github.com/apache/spark/blob/m

Re: Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Dong Joon Hyun
Thank you so much, Owen! Bests, Dongjoon. From: Owen O'Malley Date: Friday, August 4, 2017 at 9:59 AM To: Dong Joon Hyun Cc: "dev@spark.apache.org", Apache Spark PMC Subject: Re: Use Apache ORC in Apache Spark 2.3 The ORC community is really eager to get this work integrated into Spark so

Re: Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Owen O'Malley
The ORC community is really eager to get this work integrated into Spark so that Spark users can have fast access to their ORC data. Let us know if we can help with the integration. Thanks, Owen On Fri, Aug 4, 2017 at 8:05 AM, Dong Joon Hyun wrote: > Hi, All. > Apache Spark always has been

Use Apache ORC in Apache Spark 2.3

2017-08-04 Thread Dong Joon Hyun
Hi, All. Apache Spark has always been a fast and general engine, and it has supported Apache ORC inside the `sql/hive` module, with a Hive dependency, since Spark 1.4.X (SPARK-2883). However, there are still many open issues under `Feature parity for ORC with Parquet (SPARK-20901)` as of today. With new Apache ORC 1