-----Original Message-----
From: Liang-Chi Hsieh [mailto:viirya@...]
Sent: January 18, 2017 15:48
To: dev@spark.apache.org
Subject: Re: Limit Query Performance Suggestion

Hi Sujith,

I saw your updated post. It seems to make sense to me now.
If you use a very big limit number, the shuffling before `GlobalLimit` would be a performance bottleneck, of course, even if the query can eventually finish.
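(A minimal sketch to make that plan shape visible, assuming Spark 2.x in local mode with a synthetic `src` table; the outer projection is only there so the planner does not use `CollectLimit` for a root-level limit:)

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .master("local[*]")
    .appName("limit-shuffle-demo")
    .getOrCreate()

  // Synthetic stand-in for the `src` table from the example below.
  spark.range(1000000).toDF("id").createOrReplaceTempView("src")

  // Because the limit is not the root operator, it is planned as
  // LocalLimit -> Exchange SinglePartition -> GlobalLimit: up to the
  // limit's worth of rows from every partition gets shuffled to one task.
  spark.sql("select id + 1 as id2 from (select * from src limit 500000) t").explain()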
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/
--
… with sample data, and also figuring out a solution for this problem.
Please let me know if you have any clarifications or suggestions regarding this issue.
Regards,
Sujith
--
> … suggestions or solution.
>
> Thanks in advance,
> Sujith
--
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/
--
When a limit is added at the end of the physical plan, there is a possibility of a memory bottleneck: if the limit value is too large, the system will try to aggregate all of the per-partition limit results into a single partition.
Description:
E.g.:
create table src_temp as select * from src limit …
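For illustration (the plan output from the original post was truncated in the archive; this is the rough shape such a query produces on Spark 2.x, not a verbatim copy, and the database/table names are placeholders):

  == Physical Plan ==
  ExecutedCommand
  +- CreateHiveTableAsSelectCommand [Database: default, TableName: src_temp]
     +- GlobalLimit n
        +- Exchange SinglePartition
           +- LocalLimit n
              +- HiveTableScan [...], MetastoreRelation default, src

The `Exchange SinglePartition` step is the aggregation into a single partition described above.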