Hello!

I’m Robert Kaye from the MetaBrainz Foundation. We’re the people behind 
MusicBrainz ( https://musicbrainz.org ) and, more recently, ListenBrainz 
( https://listenbrainz.org ). ListenBrainz is aiming to re-create what 
last.fm used to be. We’ve already got 200M listens (AKA scrobbles) from our 
users (which is not a lot, really). We’ve set up an Apache Spark cluster and 
are starting to build user listening statistics with it.

While our setup is working, we can see that we’re not going to scale up well 
given our current approach. We’ve been reading the docs and asking for help on 
the IRC channel, but we continue to miss important bits about how we should be 
doing things. Best practices around Spark seem to be hard to come by. :(

MetaBrainz is all open source and open data — any of the data we use is 
available for anyone to download — we’re a non-profit working hard towards 
creating open source music recommendation engines. We’re hoping that someone 
could take us under their wing, turn up in our IRC channel and help us find the 
right path towards using Spark much more effectively than we’ve been so far.

Is anyone on this list interested in helping out? Perhaps you know someone who 
might?

Thanks!

--

--ruaok        

Robert Kaye     --     r...@metabrainz.org     --    http://metabrainz.org
