I tried the following:

scala> Seq((2, "a", "test"), (2, "b", "foo")).toDF("id", "a",
"b").registerTempTable("test")

scala> val df = sql("SELECT maxRow.* FROM (SELECT max(struct(id, b, a)) as
maxRow FROM test) a")
df: org.apache.spark.sql.DataFrame = [id: int, b: string ... 1 more field]

scala> df.show
+---+----+---+
| id|   b|  a|
+---+----+---+
|  2|test|  a|
+---+----+---+

Looks like the sort order is governed by the field order given in struct().

Nice feature.
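For intuition: struct comparison here is lexicographic over the fields in the order they are listed, much like ordering on plain Scala tuples. A minimal in-memory sketch (plain Scala, no Spark needed; the data mirrors the example above):

```scala
// Plain-Scala analogue of max(struct(id, b, a)): fields are compared
// left to right, the same lexicographic rule tuples follow.
val rows = Seq((2, "a", "test"), (2, "b", "foo")) // (id, a, b)

// argmax on (id, b): ids tie at 2, so b breaks the tie ("test" > "foo")
val winner = rows.maxBy { case (id, a, b) => (id, b) }

println(winner) // (2,a,test), matching the df.show output above
```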

On Fri, Feb 26, 2016 at 12:30 PM, Michael Armbrust <mich...@databricks.com>
wrote:

> There will probably be some subquery support in 2.0.  That particular
> query would be more efficient to express as an argmax however.  Here is
> an example in Spark 1.6
> <https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/1023043053387187/3170497669323442/2840265927289860/c072bba513.html>
> .
>
> On Thu, Feb 25, 2016 at 2:58 PM, Mohammad Tariq <donta...@gmail.com>
> wrote:
>
>> AFAIK, this isn't supported yet. A ticket
>> <https://issues.apache.org/jira/browse/SPARK-4226> is in progress though.
>>
>>
>>
>>
>> Tariq, Mohammad
>> about.me/mti
>> <http://about.me/mti>
>>
>>
>> On Fri, Feb 26, 2016 at 4:16 AM, Mich Talebzadeh <
>> mich.talebza...@cloudtechnologypartners.co.uk> wrote:
>>
>>>
>>>
>>> Hi,
>>>
>>>
>>>
>>> I guess the following confirms that Spark does not support sub-queries:
>>>
>>>
>>>
>>> val d = HiveContext.table("test.dummy")
>>>
>>> d.registerTempTable("tmp")
>>>
>>> HiveContext.sql("select * from tmp where id IN (select max(id) from
>>> tmp)")
>>>
>>> It crashes.
>>>
>>> The same SQL works fine in Hive itself against the underlying table:
>>>
>>> select * from dummy where id IN (select max(id) from dummy);
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>> --
>>>
>>> Dr Mich Talebzadeh
>>>
>>> LinkedIn  
>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>>
>>
>