Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Hello everyone! I have been working on a Spark history server that uses MongoDB as a datastore for processed events, iterating on the idea that the Spree project uses for the Spark UI. The project was originally designed to improve on the standalone history server by reducing its memory footprint. The project lives here: htt

Re: Spark history server running on Mongo

2017-07-18 Thread Marcelo Vanzin
See SPARK-18085. That has many of the same goals re: SHS resource usage, and also provides a (currently non-public) API where you could just create a MongoDB implementation if you want. On Tue, Jul 18, 2017 at 12:56 AM, Ivan Sadikov wrote: > Hello everyone! > > I have been working on Spark histor

Re: Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Thanks for the JIRA ticket reference! Frankly, I was aware of this work, but didn't know that there was an API for storage implementations. I will try exploring that as well, thanks! On Wed, 19 Jul 2017 at 4:18 AM, Marcelo Vanzin wrote: > See SPARK-18085. That has much of the same goals re: SHS resourc

Re: Spark history server running on Mongo

2017-07-18 Thread Ivan Sadikov
Hi Marcelo, Thanks for the reference, again. I looked at your code - really great work! I had to replace my Spark distribution to use it, though - I could not figure out how to build it separately. The repository that I linked to does not require rebuilding Spark and could be used with the current distribution