Re: Spark Data Frame. PreSorted partitions

Michael Artz Tue, 28 Nov 2017 07:43:38 -0800

I'm not aware of a way to do this, other than reading from a Hive table
that is already sorted. This sounds cool though; I'd be interested to know
the answer as well.


On Nov 28, 2017 10:40 AM, "Николай Ижиков" <nizhikov....@gmail.com> wrote:

> Hello, guys!
>
> I am working on an implementation of a custom DataSource for the Spark
> Data Frame API and have a question:
>
> If I have a `SELECT * FROM table1 ORDER BY some_column` query, I can sort
> the data inside each partition in my data source.
>
> Is there a built-in option to tell Spark that the data in each partition
> is already sorted?
>
> It seems that Spark could benefit from already-sorted partitions, for
> example by using a distributed merge sort algorithm.
>
> Does that make sense to you?
>
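To make the merge idea from the question concrete, here is a minimal sketch
in plain Python. It is not Spark code and does not use any Spark API; it only
shows why partitions that are each already sorted by `some_column` can be
combined with a k-way merge instead of a full re-sort. The row shape, the
sample data, and the helper name `merge_sorted_partitions` are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row shape: (some_column, payload)
Row = Tuple[int, str]


def merge_sorted_partitions(partitions: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Lazily merge partitions that are each already sorted by some_column."""
    # heapq.merge does a k-way merge: it never re-sorts a partition,
    # it only compares the current head element of each input.
    return heapq.merge(*partitions, key=lambda row: row[0])


# Three made-up partitions, each sorted by the first field.
partitions: List[List[Row]] = [
    [(1, "a"), (4, "d"), (9, "i")],
    [(2, "b"), (3, "c"), (8, "h")],
    [(5, "e"), (6, "f"), (7, "g")],
]

merged = list(merge_sorted_partitions(partitions))
print([key for key, _ in merged])
```

With k partitions and n rows in total, the merge costs roughly O(n log k)
comparisons, versus O(n log n) for sorting everything from scratch. Whether
a custom DataSource can tell Spark about such an ordering is the open
question in this thread.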
