True. The spec does not mandate that the bucket files be present if they
are empty (missing directories are 0-row tables).
Thanks,
Edward
On Tue, Apr 3, 2018 at 4:42 PM, Richard A. Bross wrote:
> Gopal,
>
> The Presto devs say they are willing to make the changes to adhere to the
> Hive bucke
Gopal,
The Presto devs say they are willing to make the changes to adhere to the Hive
bucket spec. I quoted
"Presto could fix their fail-safe for bucketing implementation to actually
trust the Hive bucketing spec & get you out of this mess - the bucketing
contract for Hive is actual file nam
Gopal,
Thanks for this. Great information and something to look at more closely to
better understand the internals.
Rick
----- Original Message -----
From: "Gopal Vijayaraghavan"
To: user@hive.apache.org
Sent: Tuesday, April 3, 2018 3:15:46 AM
Subject: Re: Hive, Tez, clustering, buckets, and
>* I'm interested in your statement that CLUSTERED BY does not CLUSTER BY.
> My understanding was that this was related to the number of buckets, but you
> are relating it to ORC stripes. It is odd that no examples that I've seen
> include the SORTED BY statement other than in relation to