Re: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-10 Thread Tim Smith
[Quoting Shao, Saisai:] “…‘spark.streaming.unpersist’ is a reference-based cleaning mechanism; streaming data will be removed once it falls out of the slide duration. Both of these parameters can alleviate the memory occupation of Spark…”
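To make the slide-duration point concrete, here is a minimal Spark 1.x-era Scala sketch (the source host, port, and window sizes are hypothetical, not from the thread): once an input batch slides out of the window, no DStream references it any more, and the reference-based cleaner can unpersist it.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("SlideDurationSketch") // hypothetical name
      .setMaster("local[2]")
      .set("spark.streaming.unpersist", "true") // reference-based cleaning

    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batches

    // Hypothetical receiver-based source.
    val lines = ssc.socketTextStream("localhost", 9999)

    // A 60-second window sliding every 10 seconds: input older than the
    // window is no longer referenced and becomes eligible for unpersist.
    lines.window(Seconds(60), Seconds(10)).count().print()

    ssc.start()
    ssc.awaitTermination()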

Re: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-10 Thread Tim Smith
[Quoting Luis Ángel Vicente Sánchez:] “I somehow missed that parameter when I was reviewing the documentation; that should do the trick! Thank you!” 2014-09-10 2:10 GMT+01:00 Shao, Saisai: “Hi Luis, … The parameter “s…”

Re: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-10 Thread Yana Kadiyska
[Quoting Luis Ángel Vicente Sánchez:] “…, that should do the trick! Thank you!” 2014-09-10 2:10 GMT+01:00 Shao, Saisai: “Hi Luis, … The parameter “spark.cleaner.ttl” and “spark.streaming.unpersist” can be used to remove useless timeo…”

Re: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-10 Thread Tim Smith
…, 2014 at 1:43 AM, Luis Ángel Vicente Sánchez <langel.gro...@gmail.com> wrote: “I somehow missed that parameter when I was reviewing the documentation; that should do the trick! Thank you!” 2014-09-10 2:10 GMT+01:00 Shao, Saisai: “Hi Luis, …”

Re: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-10 Thread Luis Ángel Vicente Sánchez
I somehow missed that parameter when I was reviewing the documentation; that should do the trick! Thank you! 2014-09-10 2:10 GMT+01:00 Shao, Saisai: “Hi Luis, … The parameter “spark.cleaner.ttl” and “spark.streaming.unpersist” can be used to remove useless timeout s…”

RE: spark.cleaner.ttl and spark.streaming.unpersist

2014-09-09 Thread Shao, Saisai
Hi Luis, the parameters “spark.cleaner.ttl” and “spark.streaming.unpersist” can both be used to remove useless, timed-out streaming data. The difference is that “spark.cleaner.ttl” is a time-based cleaner: it cleans not only streaming input data but also Spark’s stale metadata; while…
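The thread does not include a complete configuration; a minimal Scala sketch of how the two settings discussed here might be combined on a Spark 1.x SparkConf (the app name, master, and TTL value are assumptions, not from the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("StreamingCleanupSketch") // hypothetical name
      .setMaster("local[2]")
      // Time-based cleaner: periodically drops persisted RDDs, shuffle
      // data and metadata older than this many seconds.
      .set("spark.cleaner.ttl", "3600")
      // Reference-based cleaner: unpersist input data once no DStream
      // computation still references it.
      .set("spark.streaming.unpersist", "true")

    val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batches

One caveat worth noting: spark.cleaner.ttl must comfortably exceed the longest window or retention the job needs, otherwise data that is still referenced can be cleaned away.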

spark.cleaner.ttl and spark.streaming.unpersist

2014-09-09 Thread Luis Ángel Vicente Sánchez
…ed to use spark.cleaner.ttl and spark.streaming.unpersist together to mitigate that problem. And I also wonder whether new RDDs are being batched while an RDD is being processed. Regards, Luis