Re: How can I implement eigenvalue decomposition in Spark?

Chitturi Padma Fri, 08 Aug 2014 01:52:31 -0700

Hi,

I have a similar problem. I need matrix operations such as dot product,
cross product, transpose, and matrix multiplication to be performed on Spark.
Does Spark have a built-in API to support these?
I see a matrix factorization implementation in MLlib.



On Fri, Aug 8, 2014 at 12:38 PM, yaochunnan [via Apache Spark User List] <
ml-node+s1001560n11765...@n3.nabble.com> wrote:

> I think the eigenvalues and eigenvectors you are talking about are those of
> M^T*M or M*M^T, if we take M=U*s*V^T as the SVD. What I want is to get the
> eigenvectors and eigenvalues of M itself. Is this my misunderstanding of
> linear algebra or of the API?
>
> M^{*} M = V \Sigma^{*} U^{*} U \Sigma V^{*} = V (\Sigma^{*} \Sigma) V^{*}
> M M^{*} = U \Sigma V^{*} V \Sigma^{*} U^{*} = U (\Sigma \Sigma^{*}) U^{*}
>
>
>
> 2014-08-08 11:19 GMT+08:00 x <[hidden email]>:
>
>> Aren't U.rows.toArray.take(1).foreach(println) and
>> V.toArray.take(s.size).foreach(println)
>> the eigenvectors corresponding to the biggest eigenvalue
>> s.toArray(0)*s.toArray(0)?
>>
>> xj @ Tokyo
>>
>>
>> On Fri, Aug 8, 2014 at 12:07 PM, Chunnan Yao <[hidden email]> wrote:
>>
>>> Hi there, what you've suggested is all meaningful. But to make myself
>>> clearer, my essential problems are:
>>> 1. My matrix is asymmetric, and it is a probabilistic adjacency matrix,
>>> whose entries (a_ij) represent the likelihood that user j will broadcast
>>> the information generated by user i. Apparently, a_ij and a_ji are
>>> different, because I love you doesn't necessarily mean you love me (what a
>>> sad story~). All entries are real.
>>> 2. I know I can get eigenvalues through SVD. My problem is that I can't get
>>> the corresponding eigenvectors, which requires solving equations, and I
>>> also need eigenvectors in my calculation. In my simulation of this paper, I
>>> only need the biggest eigenvalues and the corresponding eigenvectors.
>>> The paper posted by Shivaram Venkataraman is also concerned with
>>> symmetric matrices. Could anyone help me out?
>>>
>>>
>>> 2014-08-08 9:41 GMT+08:00 x <[hidden email]>:
>>>
>>>> The SVD result already lists the singular values in descending order,
>>>> so you can get the biggest eigenvalue.
>>>>
>>>> ---
>>>>
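>>>>   import org.apache.spark.mllib.linalg.{Matrix, Vector}
>>>>   import org.apache.spark.mllib.linalg.distributed.RowMatrix
>>>>
>>>>   val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
>>>>   val U: RowMatrix = svd.U
>>>>   val s: Vector = svd.s
>>>>   val V: Matrix = svd.V
>>>>
>>>>   // First row of U (a row of the distributed matrix, not a column).
>>>>   U.rows.take(1).foreach(println)
>>>>
>>>>   // Square of the largest singular value.
>>>>   println(s.toArray(0)*s.toArray(0))
>>>>
>>>>   // V.toArray is column-major, so this prints the first column of V.
>>>>   V.toArray.take(V.numRows).foreach(println)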
>>>>
>>>> ---
>>>>
>>>> xj @ Tokyo
>>>>
>>>>
>>>> On Fri, Aug 8, 2014 at 3:06 AM, Shivaram Venkataraman <[hidden email]> wrote:
>>>>
>>>>> If you just want to find the top eigenvalue / eigenvector you can do
>>>>> something like the Lanczos method. There is a description of a MapReduce
>>>>> based algorithm in Section 4.2 of [1]
>>>>>
>>>>> [1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>>>>>
>>>>>
>>>>> On Thu, Aug 7, 2014 at 10:54 AM, Li Pu <[hidden email]> wrote:
>>>>>
>>>>>> @Miles, the latest SVD implementation in mllib is partially
>>>>>> distributed. Matrix-vector multiplication is computed among all workers,
>>>>>> but the right singular vectors are all stored in the driver. If your
>>>>>> symmetric matrix is n x n and you want the first k eigenvalues, you will
>>>>>> need to fit n x k doubles in the driver's memory. Behind the scenes, it
>>>>>> calls ARPACK to compute the eigendecomposition of A^T A. You can look
>>>>>> into the source code for the details.
>>>>>>
>>>>>> @Sean, the SVD++ implementation in graphx is not the canonical
>>>>>> definition of SVD. It doesn't have the orthogonality that SVD has. But
>>>>>> we might want to use graphx as the underlying matrix representation for
>>>>>> mllib.SVD to address the problem of skewed entry distribution.
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 7, 2014 at 10:51 AM, Evan R. Sparks <[hidden email]> wrote:
>>>>>>
>>>>>>> Reza Zadeh has contributed the distributed implementation of
>>>>>>> (Tall/Skinny) SVD (
>>>>>>> http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html),
>>>>>>> which is in MLlib (Spark 1.0), and a distributed sparse SVD is coming
>>>>>>> in Spark 1.1 (https://issues.apache.org/jira/browse/SPARK-1782). If
>>>>>>> your data is sparse (which it often is in social networks), you may
>>>>>>> have better luck with this.
>>>>>>>
>>>>>>> I haven't tried the GraphX implementation, but those algorithms are
>>>>>>> often well-suited for power-law distributed graphs as you might see in
>>>>>>> social networks.
>>>>>>>
>>>>>>> FWIW, I believe you need to square elements of the sigma matrix from
>>>>>>> the SVD to get the eigenvalues.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 7, 2014 at 10:20 AM, Sean Owen <[hidden email]> wrote:
>>>>>>>
>>>>>>>> (-incubator, +user)
>>>>>>>>
>>>>>>>> If your matrix is symmetric (and real I presume), and if my linear
>>>>>>>> algebra isn't too rusty, then its SVD is its eigendecomposition. The
>>>>>>>> SingularValueDecomposition object you get back has U and V, both of
>>>>>>>> which have columns that are the eigenvectors.
>>>>>>>>
>>>>>>>> There are a few SVDs in the Spark code. The one in mllib is not
>>>>>>>> distributed (right?) and is probably not an efficient means of
>>>>>>>> computing eigenvectors if you really just want a decomposition of a
>>>>>>>> symmetric matrix.
>>>>>>>>
>>>>>>>> The one I see in graphx is distributed? I haven't used it though.
>>>>>>>> Maybe it could be part of a solution.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 7, 2014 at 2:21 PM, yaochunnan <[hidden email]> wrote:
>>>>>>>> > Our lab needs to do some simulation on online social networks. We
>>>>>>>> > need to handle a 5000*5000 adjacency matrix, namely, to get its
>>>>>>>> > largest eigenvalue and corresponding eigenvector. Matlab can be used
>>>>>>>> > but it is time-consuming. Is Spark effective in linear algebra
>>>>>>>> > calculations and transformations? Later we would have a
>>>>>>>> > 5000000*5000000 matrix to process. It seems urgent that we find
>>>>>>>> > some distributed computation platform.
>>>>>>>> >
>>>>>>>> > I see SVD has been implemented and I can get the eigenvalues of a
>>>>>>>> > matrix through this API. But when I want to get both eigenvalues and
>>>>>>>> > eigenvectors, or at least the biggest eigenvalue and the
>>>>>>>> > corresponding eigenvector, it seems that current Spark doesn't have
>>>>>>>> > such an API. Is it possible for me to write eigenvalue decomposition
>>>>>>>> > from scratch? What should I do? Thanks a lot!
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Miles Yao
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Li
>>>>>> @vrilleup
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
>
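For the question quoted above (largest eigenvalue and eigenvector of a real,
asymmetric adjacency matrix M itself, rather than of M^T*M), one common
approach is power iteration. The following is only a minimal sketch in plain
Scala on a small in-memory matrix, not a Spark or MLlib API. The names
PowerIterationSketch, multiply and dominantEigenpair are hypothetical. It
assumes the dominant eigenvalue is real and strictly larger in magnitude than
all others, which holds for example for a matrix with all positive entries.
On Spark, the matrix-vector product would be the step to distribute.

```scala
object PowerIterationSketch {
  type Mat = Array[Array[Double]]

  // Dense matrix-vector product A * x.
  def multiply(a: Mat, x: Array[Double]): Array[Double] =
    a.map(row => row.zip(x).map { case (aij, xj) => aij * xj }.sum)

  // Euclidean norm of a vector.
  def norm(x: Array[Double]): Double = math.sqrt(x.map(v => v * v).sum)

  // Estimates the dominant eigenvalue and a unit-length right eigenvector of `a`.
  // Assumption: the dominant eigenvalue is real and strictly dominant.
  def dominantEigenpair(a: Mat,
                        maxIter: Int = 1000,
                        tol: Double = 1e-12): (Double, Array[Double]) = {
    val n = a.length
    var v = Array.fill(n)(1.0 / math.sqrt(n)) // positive unit start vector
    var lambda = 0.0
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      val w = multiply(a, v)
      val wNorm = norm(w)
      require(wNorm > 0.0, "A * v vanished; try another start vector")
      val next = w.map(_ / wNorm)
      // next has unit length, so next . (A * next) is the Rayleigh quotient.
      val newLambda = next.zip(multiply(a, next)).map { case (p, q) => p * q }.sum
      converged = math.abs(newLambda - lambda) < tol
      lambda = newLambda
      v = next
      iter += 1
    }
    (lambda, v)
  }
}
```

As a usage example, the asymmetric matrix with rows (0.1, 0.2) and (0.3, 0.2)
has eigenvalues 0.4 and -0.1, and the eigenvector for 0.4 is proportional to
(2, 3). Calling
PowerIterationSketch.dominantEigenpair(Array(Array(0.1, 0.2), Array(0.3, 0.2)))
should return an eigenvalue close to 0.4 and a vector whose second entry is
about 1.5 times the first. Convergence slows as the second-largest eigenvalue
approaches the largest in magnitude, which is where Lanczos/Arnoldi-type
methods such as ARPACK are usually preferred.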




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-can-I-implement-eigenvalue-decomposition-in-Spark-tp11646p11778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
