Hi Deb,

Putting your code on github will be much appreciated -- it will give us
a good starting point to adapt for our purposes.
Regards.

On Sat, Jun 28, 2014 at 10:57 AM, Debasish Das [via Apache Spark
Developers List] <ml-node+s1001551n7110...@n3.nabble.com> wrote:

> Factorization problems are non-convex, so both ALS and DSGD will
> converge to local minima, and it is not clear which minimum will be
> better than the other until we run both algorithms and see...
>
> So I will still say get a DSGD version running in the test setup while
> you experiment with the Spark ALS... that way you can see whether DSGD
> converges to a better minimum on your particular dataset...
>
> If you want, I can put the DSGD code base that I used for
> experimentation on github... I am not sure if Professor Re has already
> put it on github...
>
>
> On Sat, Jun 28, 2014 at 2:46 AM, Krakna H <[hidden email]> wrote:
>
> > Hi Deb,
> >
> > Thanks so much for your response! At this point, we hadn't determined
> > which of DSGD/ALS to go with and were waiting on guidance like yours
> > to tell us what the right option would be. It looks like ALS will be
> > good enough for our purposes.
> >
> > Regards.
> >
> >
> > On Fri, Jun 27, 2014 at 12:47 PM, Debasish Das [via Apache Spark
> > Developers List] <[hidden email]> wrote:
> >
> > > Hi,
> > >
> > > In my experiments with Jellyfish, I did not see ALS take any
> > > substantial RMSE loss relative to DSGD on the Netflix dataset...
> > >
> > > So we decided to stick with ALS and implemented a family of
> > > Quadratic Minimization solvers that stay in the ALS realm but can
> > > solve interesting constraints (positivity, bounds, L1,
> > > equality-constrained bounds, etc.)... We are going to show it at
> > > the Spark Summit... Also, the ALS structure is favorable for matrix
> > > factorization use cases where missing entries mean zero and you
> > > want to compute a global gram matrix using broadcast and use it
> > > for each Quadratic Minimization over all users/products...
> > >
> > > Implementing DSGD on the data partitioning that Spark ALS uses
> > > would be straightforward, but I would be more keen to see a
> > > dataset where DSGD shows you better RMSEs than ALS...
> > >
> > > If you have a dataset where DSGD produces a much better result,
> > > could you please point us to it?
> > >
> > > You can also use Jellyfish to run DSGD benchmarks to compare
> > > against ALS... It is multithreaded, and if you have enough RAM you
> > > should be able to run fairly large datasets...
> > >
> > > Be careful with the default Jellyfish... it has been tuned for the
> > > Netflix dataset (regularization, rating normalization, etc.)... So
> > > before you compare RMSEs, make sure ALS and Jellyfish are running
> > > the same algorithm (L2-regularized quadratic loss)...
> > >
> > > Thanks.
> > > Deb
> > >
> > >
> > > On Fri, Jun 27, 2014 at 3:40 AM, Krakna H <[hidden email]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Just found this thread -- is there an update on including DSGD
> > > > in Spark? We have a project that entails topic modeling on a
> > > > document-term matrix using matrix factorization, and were
> > > > wondering if we should use ALS or attempt writing our own matrix
> > > > factorization implementation on top of Spark.
> > > >
> > > > Thanks.
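For concreteness, here is a minimal sketch of the Spark ALS path
discussed above, against the Spark 1.x MLlib API. The input path and
hyperparameters are placeholders, and sc is assumed to be an existing
SparkContext (e.g. from the spark-shell):

  import org.apache.spark.mllib.recommendation.{ALS, Rating}

  // Parse (user, item, value) triples; path and format are placeholders.
  val ratings = sc.textFile("data/ratings.csv").map { line =>
    val Array(user, item, value) = line.split(',')
    Rating(user.toInt, item.toInt, value.toDouble)
  }.cache()

  // L2-regularized quadratic loss; rank/iterations/lambda are illustrative.
  val model = ALS.train(ratings, 10, 20, 0.01)

  // Training RMSE, so a DSGD run on the same data can be compared directly.
  val keyed = ratings.map { case Rating(u, p, r) => ((u, p), r) }
  val preds = model.predict(keyed.keys)
    .map { case Rating(u, p, r) => ((u, p), r) }
  val rmse = math.sqrt(
    keyed.join(preds)
      .map { case (_, (actual, pred)) => (actual - pred) * (actual - pred) }
      .mean())

Running the same L2-regularized quadratic loss in Jellyfish (same
lambda, same rating normalization) keeps the RMSE comparison
apples-to-apples, per Deb's caveat above.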
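On the global gram matrix trick Deb mentions: when missing entries
count as zeros, the per-user normal equations all share the same Y^T Y
term, so it can be computed once per sweep and broadcast. Below is a
rough sketch of that structure only, not his actual solver, using Breeze
for the dense algebra; itemFactors, ratingsByUser, and the
broadcast-everything shortcut are illustrative assumptions:

  import breeze.linalg.{DenseMatrix, DenseVector}
  import org.apache.spark.SparkContext
  import org.apache.spark.rdd.RDD

  // One user-factor sweep: solve (Y^T Y + lambda*I) w_u = sum_j r_uj * y_j
  // for every user u, where Y stacks all item factors.
  def userSweep(sc: SparkContext,
                itemFactors: RDD[(Int, DenseVector[Double])],
                ratingsByUser: RDD[(Int, Seq[(Int, Double)])],
                rank: Int, lambda: Double): RDD[(Int, DenseVector[Double])] = {
    // Global gram matrix Y^T Y: one rank x rank reduce over all items,
    // shipped to every task via broadcast.
    val gram = itemFactors.map { case (_, y) => y * y.t }.reduce(_ + _)
    val gramB = sc.broadcast(gram)
    // Broadcasting the full factor map is only sensible for small item
    // sets; a real implementation would join the blocks each partition needs.
    val itemsB = sc.broadcast(itemFactors.collectAsMap())

    ratingsByUser.mapValues { obs =>
      val a = gramB.value + DenseMatrix.eye[Double](rank) * lambda
      val b = DenseVector.zeros[Double](rank)
      obs.foreach { case (j, r) => b += itemsB.value(j) * r }
      a \ b // each user's quadratic subproblem is a rank x rank solve
    }
  }

The zero entries never show up on the right-hand side, which is exactly
why the single broadcast gram matrix covers the "missing means zero"
formulation.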
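And since the thread leaves DSGD at "straightforward on the ALS
partitioning", here is a driver-side sketch of the stratified SGD scheme
from Gemulla et al. (KDD 2011), the same family Jellyfish implements.
Collecting updates back to the driver every sub-epoch is only workable
for test-sized data; every name and hyperparameter here is made up for
illustration:

  import org.apache.spark.SparkContext
  import org.apache.spark.rdd.RDD
  import scala.collection.mutable
  import scala.util.Random

  // One observed cell of a numRows x numCols ratings matrix.
  case class Entry(i: Int, j: Int, v: Double)

  def dsgd(sc: SparkContext, entries: RDD[Entry], numRows: Int, numCols: Int,
           d: Int, rank: Int, epochs: Int, eta: Double, lambda: Double)
      : (Array[Array[Double]], Array[Array[Double]]) = {
    // Block the matrix once into a d x d grid keyed by (row block, col block).
    val blocked = entries.map(e => ((e.i % d, e.j % d), e)).groupByKey().cache()

    val rnd = new Random(0)
    val w = Array.fill(numRows, rank)(rnd.nextGaussian() * 0.01) // row factors
    val h = Array.fill(numCols, rank)(rnd.nextGaussian() * 0.01) // col factors

    for (_ <- 0 until epochs; s <- 0 until d) {
      // Stratum s: the d cells (b, (b + s) % d) share no rows or columns,
      // so SGD can run on all of them in parallel without update conflicts.
      val stratum = (0 until d).map(b => (b, (b + s) % d)).toSet
      val wB = sc.broadcast(w)
      val hB = sc.broadcast(h)

      val updates = blocked.filter { case (cell, _) => stratum(cell) }
        .map { case (_, cells) =>
          // Copy-on-write views of just the rows/columns this block touches.
          val wLoc = mutable.Map.empty[Int, Array[Double]]
          val hLoc = mutable.Map.empty[Int, Array[Double]]
          for (e <- cells) {
            val wi = wLoc.getOrElseUpdate(e.i, wB.value(e.i).clone())
            val hj = hLoc.getOrElseUpdate(e.j, hB.value(e.j).clone())
            var err = e.v
            var k = 0
            while (k < rank) { err -= wi(k) * hj(k); k += 1 }
            k = 0
            while (k < rank) { // plain L2-regularized SGD step on one cell
              val wk = wi(k); val hk = hj(k)
              wi(k) += eta * (err * hk - lambda * wk)
              hj(k) += eta * (err * wk - lambda * hk)
              k += 1
            }
          }
          (wLoc.toSeq, hLoc.toSeq)
        }.collect()

      // Strata are disjoint, so merging on the driver is conflict-free.
      for ((wu, hu) <- updates) {
        wu.foreach { case (i, row) => w(i) = row }
        hu.foreach { case (j, col) => h(j) = col }
      }
    }
    (w, h)
  }

Because each stratum touches pairwise-disjoint row and column blocks,
the d blocks can run plain SGD concurrently without locking, which is
the whole point of the scheme.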