Re: sorting in hive -- general

max scalf Mon, 09 Mar 2015 12:59:39 -0700

Thank you...

On Mon, Mar 9, 2015 at 2:23 AM, r7raul1...@163.com <r7raul1...@163.com>
wrote:


> read this article
> http://www.philippeadjiman.com/blog/2009/12/20/hadoop-tutorial-series-issue-2-getting-started-with-customized-partitioning/
>
>
> then read
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy
>
> ------------------------------
> r7raul1...@163.com
>
>
> *From:* max scalf <oracle.bl...@gmail.com>
> *Date:* 2015-03-08 07:02
> *To:* HDP mailing list <u...@hadoop.apache.org>; Hive Mailing List
> <user@hive.apache.org>
> *Subject:* sorting in hive -- general
> Hello all,
>
> I am new to Hadoop and Hive in general, and I am reading "Hadoop: The
> Definitive Guide" by Tom White. On page 504, in the Hive chapter, Tom
> says the following with regard to sorting:
>
> *Sorting and Aggregating*
> *Sorting data in Hive can be achieved by using a standard ORDER BY clause.
> ORDER BY performs a parallel total sort of the input (like that described
> in “Total Sort” on page 261). When a globally sorted result is not
> required—and in many cases it isn’t—you can use Hive’s nonstandard
> extension, SORT BY, instead. SORT BY produces a sorted file per reducer.*
>
>
> My question is: what exactly does he mean by a "globally sorted result"?
> If the SORT BY operation produces a sorted file per reducer, does that mean
> that at the end of the sort all the reducers' outputs are put back together
> to give the correct result?
>
>
>
>
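
To make the "globally sorted" distinction from the quoted passage concrete,
here is a minimal Python sketch (not Hive code) that simulates the two
behaviours. The function names sort_by and order_by, the sample rows, and the
modulo rule for assigning rows to reducers are all assumptions made for
illustration; Hive's actual distribution of rows to reducers may differ. The
point it tries to show is only that each reducer's file is sorted on its own,
and nothing merges those files into one totally ordered result.

```python
def sort_by(rows, num_reducers):
    """Simulate SORT BY: every reducer sorts only the rows it received."""
    partitions = [[] for _ in range(num_reducers)]
    for row in rows:
        # Assumption: rows are spread over reducers with a simple modulo.
        # This stands in for whatever distribution Hive really uses.
        partitions[row % num_reducers].append(row)
    # One sorted "file" per reducer; the files are never merged.
    return [sorted(part) for part in partitions]


def order_by(rows):
    """Simulate ORDER BY: one total order over all rows."""
    return sorted(rows)


rows = [7, 2, 9, 4, 1, 8, 3, 6]

files = sort_by(rows, num_reducers=2)
# Reading the per-reducer files one after another, as a client would.
concatenated = [row for part in files for row in part]

print("SORT BY files: ", files)
print("concatenated:  ", concatenated)
print("ORDER BY:      ", order_by(rows))
```

With these sample rows, each per-reducer list is sorted, but reading the
lists back to back does not give the same sequence as the ORDER BY result.
That is what "not globally sorted" means here.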
