Mostly I wanted to tell Hive the data is sorted so it could use more efficient
joins, such as a map-side join. No other reason.
On Aug 3, 2015 10:47 AM, "Ryan Harris" <ryan.har...@zionsbancorp.com> wrote:

> Unless you are using bucketing and sampling, there is no benefit (that I
> can think of) to informing hive that the data **is** in fact sorted...
>
>
>
> If there is something specific you are trying to accomplish by specifying
> the sort order of that column, perhaps you can elaborate on that.
> Otherwise, leave out the 'sorted by' statement and you should be fine.
>
>
>
> *From:* David Capwell [mailto:dcapw...@gmail.com]
> *Sent:* Monday, August 03, 2015 10:50 AM
> *To:* user@hive.apache.org
> *Subject:* Re: External sorted tables
>
>
>
> Based on the DDL, buckets are required. I was wondering if there is a way
> to get around that?
>
> As a hack I could try a single bucket (bucket=1), but if there is a better
> way I would love to know.
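>
> A minimal sketch of that single-bucket hack, assuming the grammar only
> accepts SORTED BY after CLUSTERED BY and before INTO n BUCKETS. Only the
> timestamp column is known here, so the other columns are left out, and
> the ASC direction is an assumption. Whether Hive then uses the sort order
> for joins, or whether the files on disk really match a single bucket, is
> not verified.
>
> ```sql
> -- Sketch only: declares one bucket so that SORTED BY parses.
> -- Other columns of the real table are omitted.
> CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
>   `timestamp` BIGINT   -- backticks, since timestamp is also a type name
> )
> CLUSTERED BY (`timestamp`)
> SORTED BY (`timestamp` ASC)   -- ASC is assumed
> INTO 1 BUCKETS
> LOCATION '/project/db/table';
> ```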
>
> On Aug 2, 2015 6:18 PM, "Takahiko Saito" <tysa...@gmail.com> wrote:
>
> Hi,
>
>
>
> Is it possible that 'create table sorted by' must have buckets?
>
>
>
> I found the below statements in
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL:
>
>
>
> "The CLUSTERED BY and SORTED BY creation commands do not affect how data
> is inserted into a table – only how it is read. This means that users must
> be careful to insert data correctly by specifying the number of reducers to
> be equal to the number of buckets, and using CLUSTER BY and SORT BY
> commands in their query."
>
>
>
> On Thu, Jul 30, 2015 at 7:22 PM, David Capwell <dcapw...@gmail.com> wrote:
>
> We are trying to create an external table in Hive. The data is sorted,
> so we wanted to tell Hive about this. When I do, it fails to parse the
> CREATE statement.
>
> CREATE EXTERNAL TABLE IF NOT EXISTS store.testing (
> ...
>   timestamp bigint,
> ...)
>   SORTED BY (timestamp)
> ...
>   LOCATION '/project/db/table'
> ;
> Error: Error while compiling statement: FAILED: ParseException line
> 1:507 missing EOF at 'SORTED' near ')' (state=42000,code=40000)
> 2: jdbc:hive2://localhost:10000/store>
>
> What can I do to let Hive know that my data is sorted? Every example
> online of SORTED BY is paired with buckets, but we really don't want
> to add bucketing.
>
>
> Hive version: 0.14.0
>
> Thanks for your help!
>
>
>
>
>
> --
>
> Takahiko Saito