Hi, I have a similar problem. I need to perform matrix operations such as dot product, cross product, transpose, and matrix multiplication on Spark. Does Spark have a built-in API to support these? I see a matrix factorization implementation in MLlib.
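To pin down what each of those four operations has to compute, here is a minimal plain-Scala sketch on local arrays. It does not use Spark or MLlib at all, nothing in it is distributed, and `LocalLinAlg` and its method names are hypothetical helpers made up for illustration, so treat it as a reference for the semantics rather than as an answer about the built-in API.

```scala
// Plain-Scala sketch on local arrays; LocalLinAlg and its method names are
// hypothetical helpers, not part of Spark or MLlib.
object LocalLinAlg {
  type Vec = Array[Double]
  type Mat = Array[Array[Double]] // row-major: m(i)(j) is row i, column j

  // Dot product of two vectors of equal length.
  def dot(a: Vec, b: Vec): Double = {
    require(a.length == b.length, "vectors must have the same length")
    a.zip(b).map { case (x, y) => x * y }.sum
  }

  // Cross product, defined only for 3-dimensional vectors.
  def cross(a: Vec, b: Vec): Vec = {
    require(a.length == 3 && b.length == 3, "cross product needs 3-vectors")
    Array(
      a(1) * b(2) - a(2) * b(1),
      a(2) * b(0) - a(0) * b(2),
      a(0) * b(1) - a(1) * b(0)
    )
  }

  // Transpose: rows become columns.
  def transpose(m: Mat): Mat = m.transpose

  // Matrix product: entry (i, j) is the dot product of row i of a
  // with column j of b.
  def multiply(a: Mat, b: Mat): Mat = {
    require(a.nonEmpty && b.nonEmpty && a.head.length == b.length,
      "inner dimensions must match")
    val bCols = b.transpose
    a.map(row => bCols.map(col => dot(row, col)))
  }
}
```

For example, `LocalLinAlg.multiply(a, LocalLinAlg.transpose(a))` computes A * A^T for a small local matrix `a`. For matrices that do not fit on one machine, the same definitions would have to be expressed over distributed rows or blocks.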

On Fri, Aug 8, 2014 at 12:38 PM, yaochunnan [via Apache Spark User List] <ml-node+s1001560n11765...@n3.nabble.com> wrote:

> I think the eigenvalues and eigenvectors you are talking about are those of
> M^T*M or M*M^T, if we get M=U*s*V^T as SVD. What I want is to get
> eigenvectors and eigenvalues of M itself. Is this my misunderstanding of
> linear algebra or the API?
>
> $M^{*} M = V \Sigma^{*} U^{*} \, U \Sigma V^{*} = V (\Sigma^{*} \Sigma) V^{*}$
> $M M^{*} = U \Sigma V^{*} \, V \Sigma^{*} U^{*} = U (\Sigma \Sigma^{*}) U^{*}$
>
> 2014-08-08 11:19 GMT+08:00 x <[hidden email]>:
>
>> U.rows.toArray.take(1).foreach(println) and
>> V.toArray.take(s.size).foreach(println)
>> are not eigenvectors corresponding to the biggest eigenvalue
>> s.toArray(0)*s.toArray(0)?
>>
>> xj @ Tokyo
>>
>> On Fri, Aug 8, 2014 at 12:07 PM, Chunnan Yao <[hidden email]> wrote:
>>
>>> Hi there, what you've suggested are all meaningful. But to make myself
>>> clearer, my essential problems are:
>>> 1. My matrix is asymmetric, and it is a probabilistic adjacency matrix,
>>> whose entries (a_ij) represent the likelihood that user j will broadcast
>>> the information generated by user i. Apparently, a_ij and a_ji are
>>> different, because I love you doesn't necessarily mean you love me (What
>>> a sad story~). All entries are real.
>>> 2. I know I can get eigenvalues through SVD. My problem is I can't get
>>> the corresponding eigenvectors, which requires solving equations, and I
>>> also need eigenvectors in my calculation. In my simulation of this paper,
>>> I only need the biggest eigenvalues and corresponding eigenvectors.
>>> The paper posted by Shivaram Venkataraman is also concerned with
>>> symmetric matrices. Could anyone help me out?
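
Inline note on the question above: if only the largest eigenvalue of M itself and its eigenvector are needed, plain power iteration may be enough, and it does not require symmetry. This is a simpler technique than the Lanczos method suggested elsewhere in the thread, not a substitute for it. It converges only when one eigenvalue is strictly largest in modulus; for a matrix with nonnegative entries, Perron-Frobenius theory guarantees the spectral radius is an eigenvalue with a nonnegative eigenvector, but strict dominance still has to hold for your data. The sketch below is plain Scala on a local array. `PowerIteration`, `matVec`, `dominantEigenpair`, and the stopping rule are my own assumptions, not anything from Spark.

```scala
// Power iteration sketch; PowerIteration, matVec and dominantEigenpair are
// hypothetical names, and the stopping rule is an assumption.
object PowerIteration {
  type Vec = Array[Double]
  type Mat = Array[Array[Double]] // row-major square matrix

  // y = M * x. In Spark this is the step that would be distributed
  // over the rows of M; here it runs locally.
  def matVec(m: Mat, x: Vec): Vec =
    m.map(row => row.zip(x).map { case (a, b) => a * b }.sum)

  def norm(x: Vec): Double = math.sqrt(x.map(v => v * v).sum)

  // Returns (eigenvalue estimate, unit-length right eigenvector estimate).
  def dominantEigenpair(m: Mat, maxIter: Int = 1000,
                        tol: Double = 1e-10): (Double, Vec) = {
    val n = m.length
    var v: Vec = Array.fill(n)(1.0 / math.sqrt(n)) // positive start vector
    var lambda = 0.0
    var converged = false
    var iter = 0
    while (!converged && iter < maxIter) {
      val w = matVec(m, v)
      val wNorm = norm(w)
      require(wNorm > 0.0, "M * v vanished; try a different start vector")
      val next = w.map(_ / wNorm)
      // Rayleigh quotient v^T M v (with ||v|| = 1) as the eigenvalue estimate.
      val newLambda = next.zip(matVec(m, next)).map { case (a, b) => a * b }.sum
      converged = math.abs(newLambda - lambda) < tol
      lambda = newLambda
      v = next
      iter += 1
    }
    (lambda, v)
  }
}
```

Each iteration needs only matrix-vector products, so for a large sparse adjacency matrix the vector can stay on the driver while the product is computed across workers. For left eigenvectors, run the same loop on the transpose.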
>>>
>>> 2014-08-08 9:41 GMT+08:00 x <[hidden email]>:
>>>
>>>> The computed SVD already returns the singular values in descending
>>>> order, so you can get the biggest eigenvalue.
>>>>
>>>> ---
>>>>
>>>> import org.apache.spark.mllib.linalg.{Matrix, Vector}
>>>> import org.apache.spark.mllib.linalg.distributed.RowMatrix
>>>>
>>>> // matrix is a RowMatrix; keep all singular values and also compute U
>>>> val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
>>>> val U: RowMatrix = svd.U
>>>> val s: Vector = svd.s
>>>> val V: Matrix = svd.V
>>>>
>>>> U.rows.toArray.take(1).foreach(println)
>>>>
>>>> // square of the largest singular value
>>>> println(s.toArray(0)*s.toArray(0))
>>>>
>>>> // first s.size entries of V, which is stored column-major
>>>> V.toArray.take(s.size).foreach(println)
>>>>
>>>> ---
>>>>
>>>> xj @ Tokyo
>>>>
>>>> On Fri, Aug 8, 2014 at 3:06 AM, Shivaram Venkataraman <[hidden email]> wrote:
>>>>
>>>>> If you just want to find the top eigenvalue / eigenvector you can do
>>>>> something like the Lanczos method. There is a description of a MapReduce
>>>>> based algorithm in Section 4.2 of [1]
>>>>>
>>>>> [1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>>>>>
>>>>> On Thu, Aug 7, 2014 at 10:54 AM, Li Pu <[hidden email]> wrote:
>>>>>
>>>>>> @Miles, the latest SVD implementation in mllib is partially
>>>>>> distributed. Matrix-vector multiplication is computed among all workers,
>>>>>> but the right singular vectors are all stored in the driver. If your
>>>>>> symmetric matrix is n x n and you want the first k eigenvalues, you will
>>>>>> need to fit n x k doubles in the driver's memory. Behind the scenes, it
>>>>>> calls ARPACK to compute the eigen-decomposition of A^T A. You can look
>>>>>> into the source code for the details.
>>>>>>
>>>>>> @Sean, the SVD++ implementation in graphx is not the canonical
>>>>>> definition of SVD. It doesn't have the orthogonality that SVD holds. But
>>>>>> we might want to use graphx as the underlying matrix representation for
>>>>>> mllib.SVD to address the problem of skewed entry distribution.
>>>>>>
>>>>>> On Thu, Aug 7, 2014 at 10:51 AM, Evan R. Sparks <[hidden email]> wrote:
>>>>>>
>>>>>>> Reza Zadeh has contributed the distributed implementation of
>>>>>>> (Tall/Skinny) SVD (
>>>>>>> http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html),
>>>>>>> which is in MLlib (Spark 1.0) and a distributed sparse SVD coming in
>>>>>>> Spark 1.1. (https://issues.apache.org/jira/browse/SPARK-1782). If your
>>>>>>> data is sparse (which it often is in social networks), you may have
>>>>>>> better luck with this.
>>>>>>>
>>>>>>> I haven't tried the GraphX implementation, but those algorithms are
>>>>>>> often well-suited for power-law distributed graphs as you might see in
>>>>>>> social networks.
>>>>>>>
>>>>>>> FWIW, I believe you need to square elements of the sigma matrix from
>>>>>>> the SVD to get the eigenvalues.
>>>>>>>
>>>>>>> On Thu, Aug 7, 2014 at 10:20 AM, Sean Owen <[hidden email]> wrote:
>>>>>>>
>>>>>>>> (-incubator, +user)
>>>>>>>>
>>>>>>>> If your matrix is symmetric (and real I presume), and if my linear
>>>>>>>> algebra isn't too rusty, then its SVD is its eigendecomposition. The
>>>>>>>> SingularValueDecomposition object you get back has U and V, both of
>>>>>>>> which have columns that are the eigenvectors.
>>>>>>>>
>>>>>>>> There are a few SVDs in the Spark code. The one in mllib is not
>>>>>>>> distributed (right?) and is probably not an efficient means of
>>>>>>>> computing eigenvectors if you really just want a decomposition of a
>>>>>>>> symmetric matrix.
>>>>>>>>
>>>>>>>> The one I see in graphx is distributed? I haven't used it though.
>>>>>>>> Maybe it could be part of a solution.
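
Inline note on the two statements above (squaring sigma, and SVD being the eigendecomposition for symmetric matrices): as far as I can tell they fit together as follows, for real square M. This is my own summary, so please double-check it. Squaring the singular values gives the eigenvalues of M^T M, not of M. Only when M is symmetric do they correspond to the eigenvalues of M, and even then only up to sign.

```latex
% SVD of a real square matrix M
M = U \Sigma V^{T}
\;\Longrightarrow\;
M^{T} M = V \Sigma^{2} V^{T}, \qquad M M^{T} = U \Sigma^{2} U^{T}

% Special case: M real and symmetric, M = Q \Lambda Q^{T},
% eigenvalues ordered by decreasing magnitude
\sigma_i = |\lambda_i|, \qquad
\lambda_i(M^{T} M) = \lambda_i(M^{2}) = \lambda_i^{2} = \sigma_i^{2}
```

So for an asymmetric M, the columns of U and V are eigenvectors of M M^T and M^T M respectively, and in general not eigenvectors of M.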
>>>>>>>>
>>>>>>>> On Thu, Aug 7, 2014 at 2:21 PM, yaochunnan <[hidden email]> wrote:
>>>>>>>> > Our lab needs to do some simulation on online social networks. We
>>>>>>>> > need to handle a 5000*5000 adjacency matrix, namely, to get its
>>>>>>>> > largest eigenvalue and corresponding eigenvector. Matlab can be used
>>>>>>>> > but it is time-consuming. Is Spark effective in linear algebra
>>>>>>>> > calculations and transformations? Later we would have a
>>>>>>>> > 5000000*5000000 matrix to process. It seems urgent that we find
>>>>>>>> > some distributed computation platform.
>>>>>>>> >
>>>>>>>> > I see SVD has been implemented and I can get eigenvalues of a matrix
>>>>>>>> > through this API. But when I want to get both eigenvalues and
>>>>>>>> > eigenvectors, or at least the biggest eigenvalue and the
>>>>>>>> > corresponding eigenvector, it seems that current Spark doesn't have
>>>>>>>> > such an API. Is it possible for me to write eigenvalue decomposition
>>>>>>>> > from scratch? What should I do? Thanks a lot!
>>>>>>>> >
>>>>>>>> > Miles Yao
>>>>>>>> >
>>>>>>>> > ________________________________
>>>>>>>> > View this message in context: How can I implement eigenvalue
>>>>>>>> > decomposition in Spark?
>>>>>>>> > Sent from the Apache Spark User List mailing list archive at
>>>>>>>> > Nabble.com.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-implement-eigenvalue-decomposition-in-Spark-tp11646p11778.html