Repartition the data after you apply the limit.

..Manas
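A minimal sketch of that suggestion, assuming the Spark 1.x DataFrame API; the partition count and the map function here are made-up placeholders, not part of the thread:

  // limit funnels its results into a single partition first, so
  // redistribute the rows before the expensive map() runs
  val limited = sqlContext.sql("select * from table limit 10000")
  val result = limited
    .repartition(8)            // hypothetical parallelism; tune to your cluster
    .map(row => row.mkString)  // stand-in for the real map function
    .collect()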
> It is serving as an http server, so I collect the data as the response.
>
> On 24 Dec 2015, at 8:22 AM, Hudong Wang wrote:
>
> When you call collect() it will bring all the data to the driver. Do you
> mean to call persist() instead?
>
> From: tiandiwo...@icloud.com
> Subject: Problem using limit clause in spark sql
> Date: Wed, 23 Dec 2015 21:26:51 +0800
> To: user@spark.apache.org
>
> Hi,
> I am using spark sql in a way like this:
>
> sqlContext.sql("select * from table limit 10000").map(...).collect()
>
> The problem is that the limit clause will collect all the 10,000 records
> into a single partition, resulting in the map afterwards running only in
> one partition and being really slow. Is there a way to avoid this?
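For reference, a quick way to observe the behavior the poster describes (again a sketch, assuming the Spark 1.x DataFrame API):

  val df = sqlContext.sql("select * from table limit 10000")
  // Typically prints 1: the limit operator gathers its results into a
  // single partition before any downstream stage runs.
  println(df.rdd.partitions.length)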