Maybe it would make sense to loop in the paper authors? I imagine they
might have more information than ended up in the paper.

On Mon, Dec 31, 2018 at 2:10 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> After a quick look, I don't think that the paper's
> <https://www.computer.org/csdl/proceedings/hipc/2016/5411/00/07839705.pdf>
> evaluation is very thorough. I don't see where it discusses what the
> PageRank implementation is doing in terms of object allocation or whether
> data is cached between iterations (looks like it probably isn't, based on
> Table III). It also doesn't address how this would interact with
> spark.memory.fraction. I think it would be a problem to set this threshold
> lower than spark.memory.fraction. And it doesn't say whether this is static
> or dynamic allocation.
>
> My impression is that this is obviously a good idea for some
> allocation-heavy iterative workloads, but it is unclear whether it would
> help generally:
>
> * An empty executor may delay starting tasks because of the optimistic GC
> * Full GC instead of incremental may not be needed and could increase
> starting delay
> * 1-core executors will always GC between tasks
> * Spark-managed memory may cause long GC pauses that don't recover much
> space
> * Dynamic allocation probably eliminates most of the benefit because of
> executor turnover
>
> rb
>
> On Mon, Dec 31, 2018 at 11:01 AM Reynold Xin <r...@databricks.com> wrote:
>
>> Not sure how reputable or representative that paper is...
>>
>> On Mon, Dec 31, 2018 at 10:57 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> https://github.com/apache/spark/pull/23401
>>>
>>> Interesting PR; I thought it was not worthwhile until I saw a paper
>>> claiming this can speed things up to the tune of 2-6%. Has anyone
>>> considered this before?
>>>
>>> Sean
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
