AFAIK, Spark has no public APIs to clean up those RDDs.
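There is one related knob, though: the spark.streaming.unpersist setting
(true by default) asks Spark Streaming to automatically unpersist the RDDs
it generates once they are no longer needed. A minimal sketch, assuming you
build the context yourself (the app name is a placeholder):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("streaming-cleanup-sketch")
  // Ask Spark Streaming to unpersist generated RDDs automatically
  // (this is already the default in Spark 2.x).
  .set("spark.streaming.unpersist", "true")

val ssc = new StreamingContext(conf, Seconds(10))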
On Wed, Jan 25, 2017 at 11:30 PM, Andrew Milkowski wrote:
Hi Takeshi, thanks for the answer. It looks like Spark should free up old
RDDs; however, in the admin UI we still see them. Each Block ID corresponds
to a receiver and a timestamp. For example, block input-0-1485275695898 is
from receiver 0 and was created at 1485275695898 (1/24/2017, 11:34:55 AM
GMT-5:00).
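As a quick sketch for decoding these IDs (the helper name is mine, not a
Spark API):

import java.time.Instant

// Hypothetical helper: split a streaming block id such as
// "input-0-1485275695898" into its receiver index and creation time.
def decodeBlockId(blockId: String): (Int, Instant) = {
  val Array(_, receiver, millis) = blockId.split("-")
  (receiver.toInt, Instant.ofEpochMilli(millis.toLong))
}

// decodeBlockId("input-0-1485275695898")
// => (0, 2017-01-24T16:34:55.898Z), i.e. 11:34:55 AM GMT-5:00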
Hi,
AFAIK, the blocks of minibatch RDDs are checked each time a job finishes,
and older blocks are automatically removed (see:
https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala#L463
).
You can control this behaviour with StreamingContext#remember.
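For example, a sketch of widening the retention window with that method
(the batch interval and duration are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

val conf = new SparkConf().setAppName("remember-sketch")
val ssc = new StreamingContext(conf, Seconds(10))

// Keep generated RDDs (and their blocks) for at least 5 minutes before
// the metadata cleaner is allowed to drop them.
ssc.remember(Minutes(5))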
Hello,
Using Spark 2.0.2 and running a sample streaming app with Kinesis, I
noticed (in the admin UI Storage tab) that "Stream Blocks" for each worker
keeps climbing. Also (on the same UI page), in the Blocks section I see
blocks such as the one below that are marked as Memory Serialized:
input-0-1484753367056
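The "Memory Serialized" label reflects the storage level the receiver was
created with. For reference, a sketch of what such a setup might look like
with the Spark 2.0.x KinesisUtils API (the app name, stream name, endpoint,
and region are placeholders):

import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.kinesis.KinesisUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("kinesis-sketch")
val ssc = new StreamingContext(conf, Seconds(10))

// StorageLevel.MEMORY_ONLY_SER stores received records as serialized bytes
// in memory, which is what the UI reports as "Memory Serialized".
val stream = KinesisUtils.createStream(
  ssc,
  "kinesis-sketch",                          // Kinesis app (checkpoint) name
  "my-stream",                               // Kinesis stream name
  "https://kinesis.us-east-1.amazonaws.com", // endpoint
  "us-east-1",                               // region
  InitialPositionInStream.LATEST,
  Seconds(10),                               // checkpoint interval
  StorageLevel.MEMORY_ONLY_SER)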