Maybe it would make sense to loop in the paper authors? I imagine they might have more information than ended up in the paper.
On Mon, Dec 31, 2018 at 2:10 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
> After a quick look, I don't think that the paper's
> <https://www.computer.org/csdl/proceedings/hipc/2016/5411/00/07839705.pdf>
> evaluation is very thorough. I don't see where it discusses what the
> PageRank implementation is doing in terms of object allocation or whether
> data is cached between iterations (looks like it probably isn't, based on
> Table III). It also doesn't address how this would interact with
> spark.memory.fraction. I think it would be a problem to set this threshold
> lower than spark.memory.fraction. And it doesn't say whether this is static
> or dynamic allocation.
>
> My impression is that this is obviously a good idea for some
> allocation-heavy iterative workloads, but it is unclear whether it would
> help generally:
>
> * An empty executor may delay starting tasks because of the optimistic GC
> * Full GC instead of incremental may not be needed and could increase
>   starting delay
> * 1-core executors will always GC between tasks
> * Spark-managed memory may cause long GC pauses that don't recover much
>   space
> * Dynamic allocation probably eliminates most of the benefit because of
>   executor turn-over
>
> rb
>
> On Mon, Dec 31, 2018 at 11:01 AM Reynold Xin <r...@databricks.com> wrote:
>
>> Not sure how reputable or representative that paper is...
>>
>> On Mon, Dec 31, 2018 at 10:57 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> https://github.com/apache/spark/pull/23401
>>>
>>> Interesting PR; I thought it was not worthwhile until I saw a paper
>>> claiming this can speed things up to the tune of 2-6%. Has anyone
>>> considered this before?
>>>
>>> Sean
>>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau