Re: Hive Bucketing Support

Abhijeet Kumar Wed, 06 Jun 2018 22:52:14 -0700

I would ask questions like this here <https://gitter.im/spark-scala/Lobby>.


Thanks,
Abhijeet Kumar

> On 07-Jun-2018, at 1:03 AM, Chris Martin <[email protected]> wrote:
> 
> Hi All,
> 
> 
> First off, apologies if this is not the correct place to ask this!
> 
> I've been following SPARK-19256 
> <https://issues.apache.org/jira/browse/SPARK-19256> (Hive Bucketing Support) 
> with interest for some time now as we do a relatively large amount of our 
> data processing in Spark but use Hive for business analytics.  As a result we 
> end up writing a non-trivial amount of data out twice: once in Parquet 
> optimized for Spark and once in ORC optimized for Hive!  The hope is 
> that SPARK-19256 will put an end to this.
> 
> I've noticed that there is a PR (https://github.com/apache/spark/pull/19001) 
> that's been open for almost a year now, with the last comment being over a 
> month ago.  Does anyone know if I should remain hopeful that this support 
> will be added in the near future, or is it one of those things that's 
> realistically going to be some distance off?
> 
> thanks,
> 
> Chris
> 
> 
>
