What do you want to do with the results of the query?

Henry
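Depending on the answer, there are a few common alternatives to pulling everything into the driver with collectAsList(). The sketch below is only an illustration, not a definitive recipe: it assumes Spark 2.x with a local SparkSession, uses spark.range(1000) as a stand-in for the real query, and the class name, the helper sumIdsOnDriver and the output path /tmp/query-result are made up for the example.

```java
import java.util.Iterator;
import java.util.List;

import org.apache.spark.api.java.function.ForeachPartitionFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical example class; none of these names come from the original thread.
public class CollectAlternatives {

    // Iterate on the driver without materializing the whole result:
    // toLocalIterator() fetches the data one partition at a time.
    static long sumIdsOnDriver(Dataset<Row> df) {
        long sum = 0L;
        Iterator<Row> it = df.toLocalIterator();
        while (it.hasNext()) {
            sum += it.next().getLong(0);
        }
        return sum;
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("collect-alternatives")
                .master("local[*]")   // assumption: local run for the demo
                .getOrCreate();

        // Stand-in for sqlContext.sql("query"): one LongType column named "id".
        Dataset<Row> df = spark.range(1000).toDF();

        // Option 1: if only a sample is needed, collect a bounded number of rows.
        List<Row> preview = df.limit(10).collectAsList();
        System.out.println("preview rows: " + preview.size());

        // Option 2: stream the rows through the driver partition by partition.
        System.out.println("sum of ids: " + sumIdsOnDriver(df));

        // Option 3: process the rows on the executors, so nothing is sent
        // to the driver at all. The cast picks the Java-friendly overload.
        df.foreachPartition((ForeachPartitionFunction<Row>) rows -> {
            while (rows.hasNext()) {
                Row row = rows.next();
                // Hypothetical: push `row` to an external store here.
            }
        });

        // Option 4: write the result out and read it with another tool.
        df.write().mode("overwrite").parquet("/tmp/query-result");

        spark.stop();
    }
}
```

Which option fits depends on what happens to the rows afterwards: toLocalIterator() still moves every row to the driver, just not all at once, while foreachPartition and write() keep the work on the worker nodes.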
On Wed, Mar 29, 2017 at 12:00 PM, szep.laszlo.it <szep.laszlo...@gmail.com> wrote:

> Hi,
>
> After creating a dataset
>
> Dataset<Row> df = sqlContext.sql("query");
>
> I need the result values, so I call collectAsList():
>
> List<Row> list = df.collectAsList();
>
> But this is very slow when I work with large datasets (20-30 million
> records). I know the result is not yet present in the driver app, and that
> is why it takes a long time: collectAsList() collects all the data from
> the worker nodes.
>
> What, then, is the right way to get the result values? Is there another
> way to iterate over the rows of a result dataset, or to get the values?
> Can anyone post a small, working example?
>
> Thanks & Regards,
> Laszlo Szep

--
Paul Henry Tremblay
Robert Half Technology