I somehow missed that parameter when I was reviewing the documentation;
that should do the trick! Thank you!

2014-09-10 2:10 GMT+01:00 Shao, Saisai:

> Hi Luis,
>
> The parameters “spark.cleaner.ttl” and “spark.streaming.unpersist” can
> both be used to remove useless timed-out streaming data. The difference
> is that “spark.cleaner.ttl” is a time-based cleaner: it cleans not only
> streaming input data but also Spark’s useless metadata. By contrast,
> “spark.streaming.unpersist” is a reference-based cleaning mechanism:
> streaming data is removed once it falls outside the slide duration.
>
> Both of these parameters can alleviate the memory occupation of Spark
> Streaming.
>
> On Wed, Sep 10, 2014 at 1:43 AM, Luis Ángel Vicente Sánchez <
> langel.gro...@gmail.com> wrote:
>
>> […]ed to use spark.cleaner.ttl and spark.streaming.unpersist together
>> to mitigate that problem. And I also wonder whether new RDDs are being
>> batched while an RDD is being processed.
>>
>> Regards,
>>
>> Luis
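
To make the advice above concrete, here is a minimal sketch of setting
both parameters together, assuming a Spark 1.x-era StreamingContext (the
release line discussed in this thread). The application name, TTL value,
batch interval, and the queue-backed dummy stream are illustrative
placeholders, not values from the thread.

    import scala.collection.mutable
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CleanupConfigSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setMaster("local[2]")                  // local run; drop on a cluster
          .setAppName("streaming-cleanup-sketch") // placeholder name
          // Time-based cleaner: periodically drops persisted RDDs and
          // metadata older than the TTL (here one hour, in seconds).
          .set("spark.cleaner.ttl", "3600")
          // Reference-based cleaner: unpersists a batch's input data once
          // it falls outside the slide duration.
          .set("spark.streaming.unpersist", "true")

        // Illustrative 10-second batch interval.
        val ssc = new StreamingContext(conf, Seconds(10))

        // A trivial queue-backed stream so the context has an output
        // operation and can start; a real job would read from Kafka,
        // sockets, etc.
        val queue = mutable.Queue(ssc.sparkContext.makeRDD(Seq(1, 2, 3)))
        ssc.queueStream(queue).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }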