As Keith said, it depends on what you want to do with your data.
From a pipelining perspective, the general flow (YMMV) is:
Load dataset(s) -> Transform and/or Join -> Aggregate -> Write dataset
Each step in the pipeline does something distinct with the data.
The end step is usually loading
As Paul said, it really depends on what you want to do with your data.
Writing it to a file might be a better option than collecting it, but
again, that depends on what you want to do with the data you collect.
Regards,
Keith.
http://keith-chapman.com
On Tue, Apr 4, 2017 at 7:38 AM, Eike von Seggern wrote:
>
Hi,
depending on what you're trying to achieve `RDD.toLocalIterator()` might
help you.
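
A minimal sketch of how that could look from Java, assuming the goal is to
walk the result on the driver without holding all rows in one list.
`Dataset.toLocalIterator()` returns a plain `java.util.Iterator`, so the
helper below only depends on the JDK. The class name `LocalIteratorSketch`,
the method `forEachBatch`, and the batch size are hypothetical names chosen
for illustration, not part of Spark.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: not part of Spark, works with any java.util.Iterator.
final class LocalIteratorSketch {

    /**
     * Drains the iterator in batches of at most batchSize elements and passes
     * each batch to the handler, so only one batch is buffered at a time.
     * Returns the total number of elements seen.
     */
    static <T> long forEachBatch(Iterator<T> rows, int batchSize,
                                 Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (rows.hasNext()) {
            batch.add(rows.next());
            total++;
            if (batch.size() == batchSize) {
                handler.accept(batch);
                // Start a fresh list so the handler may keep the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Flush the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            handler.accept(batch);
        }
        return total;
    }
}
```

With a `Dataset<Row> df` this would be called roughly as
`LocalIteratorSketch.forEachBatch(df.toLocalIterator(), 1000, batch -> ...)`.
Spark still has to bring each partition to the driver in turn, so the largest
partition needs to fit in driver memory.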
Best
Eike
2017-03-29 21:00 GMT+02:00 szep.laszlo.it:
> Hi,
>
> after I created a dataset
>
> Dataset<Row> df = sqlContext.sql("query");
>
> I need to get the result values, so I call the method collectAsList()
>
What do you want to do with the results of the query?
Henry
On Wed, Mar 29, 2017 at 12:00 PM, szep.laszlo.it wrote:
> Hi,
>
> after I created a dataset
>
> Dataset<Row> df = sqlContext.sql("query");
>
> I need to get the result values, so I call the method collectAsList()
>
> List list = df.collectAsL