Take a look at the SparkListener API included in Spark; you can use it to 
capture various events. There’s also this pull request: 
https://github.com/apache/spark/pull/42, which will persist application logs 
and let you rebuild the web UI after the app runs. It uses the same API to 
log events.
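
For reference, a minimal sketch of what a cache-tracking listener might look like. This is a hypothetical example, not part of any existing tool: the class name `CacheTrackingListener` is made up, and the exact event classes and fields (`SparkListenerUnpersistRDD`, `stageInfo.rddInfos`, `storageLevel`) should be checked against the SparkListener trait in your Spark version:

```scala
import org.apache.spark.scheduler._

// Hypothetical sketch: a listener that records persist/unpersist activity,
// which a cache-analysis tool could use to reconstruct cache contents
// at a given point in time.
class CacheTrackingListener extends SparkListener {

  override def onUnpersistRDD(event: SparkListenerUnpersistRDD): Unit = {
    println(s"RDD ${event.rddId} unpersisted")
  }

  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    // Stage-level RDD info includes storage levels for persisted RDDs.
    event.stageInfo.rddInfos.filter(_.storageLevel.isValid).foreach { rdd =>
      println(s"RDD ${rdd.id} (${rdd.name}) cached at ${rdd.storageLevel}")
    }
  }
}

// Registering the listener (assumes an existing SparkContext `sc`):
// sc.addSparkListener(new CacheTrackingListener)
```

Since the event-logging code in that pull request writes these same events to a log, a post-hoc tool could parse the log instead of registering a live listener.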

Matei

On Mar 17, 2014, at 7:35 AM, Roman Pastukhov <metaignat...@gmail.com> wrote:

> Hi.
> 
> We're thinking about writing a tool that would read Spark logs and output 
> the cache contents at a given point in time (e.g. if you want to see what 
> data fills the cache and whether some of it could be unpersisted to improve 
> performance).
> 
> Are there existing projects that do something similar? Is there a list of 
> Spark-related tools? There is the Spark debugger/SRD 
> (https://github.com/mesos/spark/wiki/Spark-Debugger, 
> http://spark-replay-debugger-overview.readthedocs.org/en/latest/), but I 
> couldn't find any links to them on the Spark project site.