The best way to run this today is probably to convert the query into a join manually: create a DataFrame that holds all the numbers, then join (or outer join) it with the other table. That way you avoid parsing a gigantic string.
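For concreteness, a minimal Scala sketch of that approach, assuming a SQLContext in scope as sqlContext and a registered table named "events" with an integer column "id" (the table and column names are illustrative, not from the thread):

    // Build a one-column DataFrame from the list of numbers.
    // Tuple1 wrapping is needed because createDataFrame expects Products.
    val ids = (1 to 1000000).map(Tuple1.apply)
    val idsDF = sqlContext.createDataFrame(ids).toDF("id")

    val events = sqlContext.table("events")

    // Inner join on "id": equivalent to WHERE id IN (...),
    // keeping only the rows whose id appears in the list.
    val matched = events.join(idsDF, "id")

    // Left outer join from the id list: also surfaces ids
    // that had no matching row in the table.
    val withMisses = idsDF.join(events, idsDF("id") === events("id"), "left_outer")

If the list is large, Spark can broadcast the small side or shuffle both sides as appropriate, instead of parsing and optimizing a million-element IN expression.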
On Fri, Dec 4, 2015 at 10:36 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Have you seen this JIRA ?
>
> [SPARK-8077] [SQL] Optimization for TreeNodes with large numbers of
> children
>
> From the numbers Michael published, 1 million numbers would still need 250
> seconds to parse.
>
> On Fri, Dec 4, 2015 at 10:14 AM, Madabhattula Rajesh Kumar <
> mrajaf...@gmail.com> wrote:
>
>> Hi,
>>
>> How to use/best practices "IN" clause in Spark SQL.
>>
>> Use Case :- Read the table based on number. I have a List of numbers.
>> For example, 1million.
>>
>> Regards,
>> Rajesh