On Thu, Mar 7, 2019 at 12:58 PM, Tyson wrote:
Thanks Ryan and Reynold for the information!
Cheers,
Tyson
From: Ryan Blue
Sent: Wednesday, March 6, 2019 3:47 PM
To: Reynold Xin
Cc: tcon...@gmail.com; Spark Dev List
Subject: Re: Hive Hash in Spark
I think this was needed to add support for bucketed Hive tables. Like Tyson
noted, if the other side of a join can be bucketed the same way, then Spark
can use a bucketed join. I have long-term plans to support this in the
DataSourceV2 API, but I don't think we are very close to implementing it
yet.
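
(Illustrative aside, not part of the original thread: the sketch below shows the bucketed-join idea Ryan describes, but using Spark's built-in bucketBy, which hashes with Spark's Murmur3-based hash rather than Hive's hash function. The table names and the bucket count of 8 are invented for the example.)

// Minimal sketch, assuming Spark's native bucketing (not Hive hash).
import org.apache.spark.sql.SparkSession

object BucketedJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("bucketed-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Write both sides bucketed the same way: same column, same bucket count.
    Seq((1, "a"), (2, "b")).toDF("id", "v")
      .write.bucketBy(8, "id").sortBy("id").saveAsTable("left_bucketed")
    Seq((1, "x"), (2, "y")).toDF("id", "w")
      .write.bucketBy(8, "id").sortBy("id").saveAsTable("right_bucketed")

    // With matching bucketing on the join key, Spark can plan a sort-merge
    // join without shuffling either side; explain() should show no Exchange
    // under the join.
    spark.table("left_bucketed")
      .join(spark.table("right_bucketed"), "id")
      .explain()

    spark.stop()
  }
}

As I understand the thread, Hive Hash support would aim for the same no-shuffle behavior when one side is an existing Hive-bucketed table, since Spark would then need to bucket the other side with Hive's hash function.
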
Reynold Xin wrote:
I think they might be used in bucketing? Not 100% sure.
On Wed, Mar 06, 2019 at 1:40 PM, < tcon...@gmail.com > wrote:
>
> Hi,
>
> I noticed the existence of a Hive Hash partitioning implementation in
> Spark, but also noticed that it’s not being used, and that the Spark