Thanks Michael for confirming!
On Thu, Jul 31, 2014 at 2:43 PM, Michael Armbrust wrote:
The performance should be the same using the DSL or SQL strings.
On Thu, Jul 31, 2014 at 2:36 PM, Buntu Dev wrote:
I was not sure whether calling registerAsTable() and then querying against
that table has additional performance impact, and whether the DSL eliminates that.
On Thu, Jul 31, 2014 at 2:33 PM, Zongheng Yang wrote:
Looking at what this patch [1] has to do to achieve it, I am not sure
if you can do the same thing in 1.0.0 using DSL only. Just curious,
why don't you use the hql() / sql() methods and pass a query string
in?
[1] https://github.com/apache/spark/pull/1211/files
On Thu, Jul 31, 2014 at 2:20 PM, Bu
Thanks Zongheng for the pointer. Is there a way to achieve the same in
1.0.0?
On Thu, Jul 31, 2014 at 1:43 PM, Zongheng Yang wrote:
countDistinct was added recently and is in 1.0.2. If you are using that
or the master branch, you could try something like:
r.select('keyword, countDistinct('userId)).groupBy('keyword)
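For reference, the expression above appears intended to count the distinct userId values per keyword, roughly what SELECT keyword, COUNT(DISTINCT userId) ... GROUP BY keyword would return in SQL. Below is a minimal sketch of that semantics using plain Scala collections only, not the Spark API. The object name, the helper name, and the sample rows are hypothetical and chosen purely for illustration.

```scala
// Plain-Scala illustration of "count distinct userId per keyword".
// Not Spark code: names and sample data are hypothetical.
object DistinctPerKeyword {

  // rows are (keyword, userId) pairs
  def distinctUsersPerKeyword(rows: Seq[(String, String)]): Map[String, Int] =
    rows
      .groupBy(_._1)                      // group rows by keyword
      .map { case (keyword, group) =>
        // keep only the userIds, drop duplicates, and count what is left
        keyword -> group.map(_._2).distinct.size
      }

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      ("spark", "u1"),
      ("spark", "u2"),
      ("spark", "u1"), // same user again, counted once
      ("hive",  "u3")
    )
    println(distinctUsersPerKeyword(rows))
  }
}
```

With the sample rows above, "spark" maps to 2 and "hive" maps to 1, since the repeated ("spark", "u1") row contributes only once.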
On Thu, Jul 31, 2014 at 12:27 PM, buntu wrote:
> I'm looking to write a select statement to get a distinct c