Hi Tarek,
Thanks for your interest and for checking the guidelines first! Two points:
Algorithm: PCA is of course a critical algorithm. The main question is how
your algorithm/implementation differs from the current PCA implementation.
If it's different and potentially better, I'd recommend opening a JIRA.
On Fri, May 8, 2015 at 4:16 AM, Steve Loughran
wrote:
> Would there be a place in the code tree for some tests to run against
> things like this? They're cloud integration tests rather than unit tests
> and nobody would want them to be on by default, but it could be good for
> regression testing
Hi folks,
I wanted to get a sanity check before opening a JIRA. I am trying to do the
following:
create a HiveContext, then from different threads:
1. Create a DataFrame
2. Name said df via registerTempTable
3. Do a simple query via sql, then dropTempTable
My understanding is that since HiveContext
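The steps above register and drop temp tables from multiple threads against a single HiveContext, whose temp-table catalog is shared across threads. A minimal plain-Python sketch of that pattern (not actual Spark code; `TempTableRegistry` and `worker` are hypothetical stand-ins) illustrating why each thread should use a unique table name against a shared catalog:

```python
import threading

class TempTableRegistry:
    """Hypothetical stand-in for a HiveContext's shared temp-table catalog."""
    def __init__(self):
        self._tables = {}
        self._lock = threading.Lock()

    def register(self, name, df):
        with self._lock:
            if name in self._tables:
                raise ValueError(f"temp table {name!r} already registered")
            self._tables[name] = df

    def drop(self, name):
        with self._lock:
            self._tables.pop(name, None)

    def count(self):
        with self._lock:
            return len(self._tables)

def worker(registry, thread_id):
    # Each thread uses a unique name, so concurrent register/drop
    # calls against the shared catalog cannot collide.
    name = f"df_{thread_id}"
    registry.register(name, object())  # object() stands in for a DataFrame
    registry.drop(name)

registry = TempTableRegistry()
threads = [threading.Thread(target=worker, args=(registry, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(registry.count())  # 0: every thread dropped its own table
```

If two threads picked the same name, one would fail at `register` (or worse, drop the other's table), which is the kind of interaction worth pinning down before filing the JIRA.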
I just noticed I sent this to users instead of dev:
-- Forwarded message --
From: Fernando O.
Date: Sat, May 16, 2015 at 4:09 PM
Subject: Problem building master on 2.11
To: "u...@spark.apache.org"
Is anyone else having issues when building spark from git?
I created a JIRA ticket
Hi Patrick,
Thank you very much for your response. I am almost there, but am not sure
about my conclusion. Let me try to approach it from a different angle.
I would like to time the impact of a particular lambda function, or, if
possible, more broadly measure the impact of any map function. I
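One way to approach the measurement above is to wrap the function before passing it to the map, recording each call's duration. A plain-Python sketch using only the standard library (in Spark you would typically aggregate the per-call timings on the driver, e.g. via an accumulator, which is not shown here; `timed` is a hypothetical helper):

```python
import time

def timed(fn, durations):
    """Wrap fn so each call's wall-clock duration is appended to durations."""
    def wrapper(x):
        start = time.perf_counter()
        result = fn(x)
        durations.append(time.perf_counter() - start)
        return result
    return wrapper

durations = []
square = timed(lambda x: x * x, durations)

results = list(map(square, range(5)))
print(results)         # [0, 1, 4, 9, 16]
print(len(durations))  # 5 timings, one per element
total = sum(durations)  # aggregate cost attributable to the lambda
```

The wrapper adds a small per-call overhead of its own, so this measures an upper bound on the lambda's cost rather than its exact impact.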
Hi,
I would like to contribute an algorithm to the MLlib project. I have
implemented a scalable PCA algorithm on Spark. It is scalable for both tall
and fat matrices, and the paper describing it has been accepted for
publication at the SIGMOD 2015 conference. I looked at the guidelines in the
following link:
ht