Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread lucas.g...@gmail.com
As Keith said, it depends on what you want to do with your data. From a pipelining perspective the general flow (YMMV) is: Load dataset(s) -> Transform and/or Join -> Aggregate -> Write dataset. Each step in the pipeline does something distinct with the data. The end step is usually loading

Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread Keith Chapman
As Paul said, it really depends on what you want to do with your data. Writing it to a file might be a better option, but again, that depends on what you plan to do with the data you collect.
Regards, Keith.
http://keith-chapman.com
On Tue, Apr 4, 2017 at 7:38 AM, Eike von Seggern wrote:
>

Re: Alternatives for dataframe collectAsList()

2017-04-04 Thread Eike von Seggern
Hi, depending on what you're trying to achieve, `RDD.toLocalIterator()` might help you.
Best, Eike
2017-03-29 21:00 GMT+02:00 szep.laszlo.it:
> Hi,
>
> after I created a dataset
>
> Dataset df = sqlContext.sql("query");
>
> I need to have a result values and I call a method: collectAsList()
>
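To illustrate the iterator approach Eike mentions, here is a minimal sketch of handling results one at a time instead of materializing them as a list. `LocalIteration` and `forEachRow` are hypothetical names, not part of Spark, and the in-memory list in `main` is only a stand-in for a real query result so the example runs without a cluster.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; it is not a Spark class.
final class LocalIteration {

    // Passes each element to the handler as soon as the iterator yields it
    // and returns how many elements were handled. Nothing is copied into a list.
    static <T> long forEachRow(Iterator<T> rows, Consumer<? super T> handler) {
        long count = 0;
        while (rows.hasNext()) {
            handler.accept(rows.next());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in data (assumption): a real job would obtain the iterator from Spark.
        Iterator<String> rows = Arrays.asList("a", "b", "c").iterator();
        long handled = forEachRow(rows, row -> System.out.println(row));
        System.out.println("rows handled: " + handled);
    }
}
```

With Spark 2.x, the iterator could come from `df.toLocalIterator()` on the dataset returned by `sqlContext.sql(...)`. According to the Spark API documentation, that iterator needs about as much driver memory as the largest partition, whereas `collectAsList()` holds the whole result at once. Whether this helps still depends on what is done with each row.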

Re: Alternatives for dataframe collectAsList()

2017-04-03 Thread Paul Tremblay
What do you want to do with the results of the query?
Henry
On Wed, Mar 29, 2017 at 12:00 PM, szep.laszlo.it wrote:
> Hi,
>
> after I created a dataset
>
> Dataset df = sqlContext.sql("query");
>
> I need to have a result values and I call a method: collectAsList()
>
> List list = df.collectAsL