I'd also like to get Wiki write access - at the least it allows a few of us
to amend the "Powered By" and similar pages when those requests come
through (Sean has been doing a lot of that recently :)
On Mon, Jan 11, 2016 at 11:01 PM, Sean Owen wrote:
> ... I forget who can give access -- is it I
Hi all,
I'd like to share news of the recent release of a new Spark package,
[ROSE](http://spark-packages.org/package/onetapbeyond/opencpu-spark-executor).
ROSE is a Scala library offering access to the full scientific computing power
of the R programming language to Apache Spark batch and stre
Hi all, I've been experimenting with DataFrame operations in a mixed-endian
environment - a big-endian master with little-endian workers. With
Tungsten enabled I'm encountering data corruption issues.
For example, with this simple test code:
import org.apache.spark.SparkContext
import org.apach
I logged SPARK-12778, where endian awareness in Platform.java should
help in a mixed-endian setup.
There could be other parts of the code base which are related.
Cheers
On Tue, Jan 12, 2016 at 7:01 AM, Adam Roberts wrote:
> Hi all, I've been experimenting with DataFrame operations in a mixed
> e
David,
Thank you very much for announcing this! It looks like it could be very
useful. Would you mind providing a link to the github?
On Tue, Jan 12, 2016 at 10:03 AM, David
wrote:
> Hi all,
>
> I'd like to share news of the recent release of a new Spark package, ROSE.
>
>
> ROSE is a Scala lib
Hi Corey,
> Would you mind providing a link to the github?
Sure, here is the github link you're looking for:
https://github.com/onetapbeyond/opencpu-spark-executor
David
"All that is gold does not glitter, Not all those who wander are lost."
Original Message
Subject: Re: R
> Ok, sounds good. I think it would be great, if you could add installing the
> 'docker-engine' package and starting the 'docker' service in there too. I
> was planning to update the playbook if there were one in the apache/spark
> repo but I didn't see one, hence my question.
>
we currently have d
Hi,
I wanted to know whether there are any implementations yet, within the Machine
Learning Library or elsewhere, that can efficiently solve eigenvalue problems.
Or, if not, do you have suggestions on how to approach a parallel
implementation, maybe with BLAS or Breeze?
Thanks in advance!
Lydia
Sent from my i
How big of a deal is this use case in a heterogeneous endianness
environment? If we do want to fix it, we should do it right before
Spark shuffles data to minimize the performance penalty, i.e. turn big-endian
encoded data into little-endian encoded data before it goes on the wire.
This is a prett
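Reynold's suggestion above (byte-swap just before data goes on the wire) can be sketched in plain JVM code with java.nio. This is only an illustrative helper under assumed names, not Spark's actual Platform.java or shuffle code:

```scala
import java.nio.{ByteBuffer, ByteOrder}

object EndianSketch {
  // Hypothetical helper: re-encode a payload of big-endian longs as
  // little-endian before it is shuffled over the wire.
  def toLittleEndian(bigEndianBytes: Array[Byte]): Array[Byte] = {
    val src = ByteBuffer.wrap(bigEndianBytes).order(ByteOrder.BIG_ENDIAN)
    val dst = ByteBuffer.allocate(bigEndianBytes.length).order(ByteOrder.LITTLE_ENDIAN)
    while (src.remaining() >= 8) dst.putLong(src.getLong())
    dst.array()
  }

  def main(args: Array[String]): Unit = {
    // A big-endian encoded long, as a big-endian master might produce it.
    val payload = ByteBuffer.allocate(8).order(ByteOrder.BIG_ENDIAN).putLong(42L).array()
    val swapped = toLittleEndian(payload)
    // A little-endian worker can now read the value back correctly.
    val readBack = ByteBuffer.wrap(swapped).order(ByteOrder.LITTLE_ENDIAN).getLong()
    println(readBack) // prints 42
  }
}
```

The real fix would of course have to know the schema of the serialized row to swap each field at its proper width; the sketch assumes a homogeneous array of longs purely for illustration.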
Ran into this need myself. Does Spark have an equivalent of
"mapreduce.input.fileinputformat.list-status.num-threads"?
Thanks.
On Thu, Jul 23, 2015 at 8:50 PM, Cheolsoo Park wrote:
> Hi,
>
> I am wondering if anyone has successfully enabled
> "mapreduce.input.fileinputformat.list-status.num-t
Hi,
I'm putting together a Spark package (in the spark-packages.org sense)
and I'd like to make use of the class
org.apache.spark.mllib.util.TestingUtils which appears in
mllib/src/test. Can I declare a dependency in my build.sbt to pull in
a suitable jar? I have searched around but I have not bee
There is no annotation in TestingUtils class indicating whether it is
suitable for consumption by external projects.
You should assume the class is not public since its methods may change in
future Spark releases.
Cheers
On Tue, Jan 12, 2016 at 12:36 PM, Robert Dodier
wrote:
> Hi,
>
> I'm putt
If you need it, just copy it over to your own package. That's probably the
safest option.
On Tue, Jan 12, 2016 at 12:50 PM, Ted Yu wrote:
> There is no annotation in TestingUtils class indicating whether it is
> suitable for consumption by external projects.
>
> You should assume the class is
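For reference, the kind of tolerance helper TestingUtils supplies is small enough that copying it into your own package is cheap. A minimal stand-in (the names here are my own, not Spark's) might look like:

```scala
// Minimal approximate-equality helpers of the sort a test utility provides.
// Illustrative only; not copied from org.apache.spark.mllib.util.TestingUtils.
object ApproxEquality {
  // Absolute tolerance: |a - b| <= eps.
  def absTolEquals(a: Double, b: Double, eps: Double = 1e-6): Boolean =
    math.abs(a - b) <= eps

  // Relative tolerance: |a - b| <= eps * max(|a|, |b|).
  def relTolEquals(a: Double, b: Double, eps: Double = 1e-6): Boolean =
    math.abs(a - b) <= eps * math.max(math.abs(a), math.abs(b))
}
```

Owning the helper outright also insulates your package from changes to Spark's test jars between releases, which is the concern raised above.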
On 12 Jan 2016, at 10:49, Reynold Xin <r...@databricks.com> wrote:
How big of a deal is this use case in a heterogeneous endianness environment?
If we do want to fix it, we should do it right before Spark shuffles data
to minimize performance penalty, i.e. turn big-endian encoded d
(I don't know anything spark specific, so I'm going to treat it like a
Breeze question...)
As I understand it, Spark uses ARPACK via Breeze for SVD, and presumably
the same approach can be used for EVD. Basically, you make a function that
multiplies your "matrix" (which might be represented
implic
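The matrix-free contract described above (the solver only ever calls a user-supplied matrix-vector multiply, which is also how ARPACK works) can be illustrated with a plain-Scala power iteration. This is a sketch of the idea only, not Spark or Breeze code:

```scala
object PowerIterationSketch {
  // Matrix-free power iteration: the caller supplies only a function that
  // multiplies the (possibly implicitly represented) matrix by a vector.
  // Returns (|dominant eigenvalue|, eigenvector estimate).
  def powerIteration(matVec: Array[Double] => Array[Double],
                     n: Int, iters: Int = 100): (Double, Array[Double]) = {
    var v = Array.fill(n)(1.0 / math.sqrt(n))
    var lambda = 0.0
    for (_ <- 0 until iters) {
      val w = matVec(v)
      val norm = math.sqrt(w.map(x => x * x).sum)
      v = w.map(_ / norm)
      lambda = norm
    }
    (lambda, v)
  }

  def main(args: Array[String]): Unit = {
    // Example: symmetric 2x2 matrix [[2, 1], [1, 2]]; dominant eigenvalue is 3.
    val a = Array(Array(2.0, 1.0), Array(1.0, 2.0))
    val mv = (x: Array[Double]) =>
      a.map(row => row.zip(x).map { case (r, xi) => r * xi }.sum)
    val (lambda, _) = powerIteration(mv, 2)
    println(lambda) // ≈ 3.0
  }
}
```

In a Spark setting the `matVec` closure would hide a distributed multiply (e.g. against a RowMatrix), which is exactly why the matrix never needs to be materialized on one machine.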
On Tue, Jan 12, 2016 at 12:55 PM, Reynold Xin wrote:
> If you need it, just copy it over to your own package. That's probably the
> safest option.
OK, not a big deal, I was just hoping to avoid that, in part because the stuff
I'm working on is also proposed as a pull request, and it seems like i
(x86 is little-endian and SPARC / POWER / ARM are big-endian; I'm sure
that was just a typo)
On Tue, Jan 12, 2016 at 9:13 PM, Steve Loughran wrote:
> It's notable that Hadoop doesn't like mixed-endianness; there is work
> (primarily from Oracle) to have consistent byteswapping —that is: work
> re
On 12 Jan 2016, at 10:49, Reynold Xin <r...@databricks.com> wrote:
How big of a deal is this use case in a heterogeneous endianness environment?
If we do want to fix it, we should do it right before Spark shuffles data
to minimize performance penalty, i.e. turn big-endian encoded d
I was looking into the Apache export control page and didn't see Spark
listed there, which from my initial investigation seemed OK because I
couldn't find any handling of cryptography in Spark code.
Could someone more familiar with the Spark dependency hierarchy confirm
that there is no specific
FWIW, POWER is bi-endian. AIX still runs big-endian on POWER, but the
latest Linux distros for POWER run little-endian (in fact Ubuntu for POWER
only runs LE).
> (x86 is little-endian and SPARC / POWER / ARM are big-endian; I'm sure
> that was just a typo)
>
> On Tue, Jan 12, 2016 at 9:13
Alex, see this jira-
https://issues.apache.org/jira/browse/SPARK-9926
On Tue, Jan 12, 2016 at 10:55 AM, Alex Nastetsky <
alex.nastet...@vervemobile.com> wrote:
> Ran into this need myself. Does Spark have an equivalent of
> "mapreduce.input.fileinputformat.list-status.num-threads"?
>
> Thanks.
Thanks. I was actually able to get
mapreduce.input.fileinputformat.list-status.num-threads working in Spark
against a regular fileset in S3, in Spark 1.5.2 ... looks like the issue is
isolated to Hive.
On Tue, Jan 12, 2016 at 6:48 PM, Cheolsoo Park wrote:
> Alex, see this jira-
> https://issues
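For anyone finding this thread later, the setting can be passed through Spark's Hadoop configuration. A hedged sketch (assuming a plain SparkContext; per SPARK-9926 the Hive read path did not honor it):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative only: thread the Hadoop listing setting through Spark's
// Hadoop configuration so FileInputFormat-based reads list files in parallel.
val sc = new SparkContext(new SparkConf().setAppName("parallel-listing"))
sc.hadoopConfiguration.setInt(
  "mapreduce.input.fileinputformat.list-status.num-threads", 20)
```

The value 20 is an arbitrary example; tune it to the latency of the underlying store (it matters most for high-latency filesystems such as S3).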
Hi Richard,
> Would it be possible to access the session API from within ROSE,
> to get for example the images that are generated by R / openCPU
Technically it would be possible although there would be some potentially
significant runtime costs per task in doing so, primarily those related to
e