Sent: Monday, December 7, 2015 10:57 AM
Subject: Re: Shared memory between C++ process and Spark
Annabel
Spark works very well with data stored in HDFS but is certainly not tied to it.
Have a look at the wide variety of connectors to things like Cassandra, HBase,
etc.
Robin
Sent fro
I’m not sure what point you’re trying to prove and I’m not particularly
interested in getting into a protracted discussion. Here is what you wrote: The
architecture of Spark is to run on top of HDFS. I interpreted that as a
statement implying that Spark has to run on HDFS which is definitely not
Hi Annabel
I certainly did read your post. My point was that Spark can read from HDFS but
is in no way tied to that storage layer. A very interesting use case that
sounds very similar to Jia's (as mentioned by another poster) is contained in
https://issues.apache.org/jira/browse/SPARK-10399. T
Annabel
Spark works very well with data stored in HDFS but is certainly not tied to it.
Have a look at the wide variety of connectors to things like Cassandra, HBase,
etc.
Robin
Sent from my iPhone
> On 7 Dec 2015, at 18:50, Annabel Melongo wrote:
>
> Jia,
>
> I'm so confused on this. The
Thanks, Annabel, but I may need to clarify that I have no intention to write
and run Spark UDF in C++, I'm just wondering whether Spark can read and write
data to a C++ process with zero copy.
Best Regards,
Jia
On Dec 7, 2015, at 12:26 PM, Annabel Melongo wrote:
> My guess is that Jia want
My guess is that Jia wants to run C++ on top of Spark. If that's the case, I'm
afraid this is not possible. Spark has support for Java, Python, Scala and R.
The best way to achieve this is to run your application in C++ and use the
data created by said application to do manipulation within Spark
Is this JIRA entry related to what you want?
https://issues.apache.org/jira/browse/SPARK-10399
Regards,
Kazuaki Ishizaki
From: Jia
To: Dewful
Cc: "user @spark" , dev@spark.apache.org, Robin
East
Date: 2015/12/08 03:17
Subject: Re: Shared memory between C++ p
Thanks, Dewful!
My impression is that Tachyon is a very nice in-memory file system that can
connect to multiple storages.
However, because our data is also held in memory, I suspect that connecting to
Spark directly may be more efficient.
But definitely I need to look at Tachyon m
Maybe looking into something like Tachyon would help. I see some sample C++
bindings, not sure how much of the current functionality they support...
Hi, Robin,
Thanks for your reply and thanks for copying my question to user mailing list.
Yes, we have a distributed C++ application, that will store data on each node
in the cluster, and we hope to leverage Spark to do more fancy analytics on
those data. But we need high performance, that’s why
-dev, +user (this is not a question about development of Spark itself so you’ll
get more answers in the user mailing list)
First up let me say that I don’t really know how this could be done - I’m sure
it would be possible with enough tinkering but it’s not clear what you are
trying to achieve.
Dears, for one project, I need to implement something so Spark can read data
from a C++ process.
To provide high performance, I really hope to implement this through shared
memory between the C++ process and Java JVM process.
It seems it may be possible to use named memory mapped files and JNI t
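As a rough illustration of the named memory-mapped file idea raised in the original question, here is a minimal JVM-side sketch using only standard `java.nio` calls. Everything specific in it is an assumption made for the example and not something from the thread: the class name `SharedRegionSketch`, the file layout (a 4-byte little-endian count followed by that many 8-byte doubles), and the use of a temporary file as the shared region. The `writeRegion` method only stands in for the C++ process, which would map the same path itself (for example with `mmap`). No JNI is involved, and no Spark integration is shown.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedRegionSketch {

    // Stand-in for the C++ writer. Hypothetical layout:
    // 4-byte little-endian count, then `count` 8-byte doubles.
    static void writeRegion(Path path, double[] values) throws IOException {
        long size = Integer.BYTES + (long) values.length * Double.BYTES;
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Mapping READ_WRITE beyond the current file size grows the file.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.order(ByteOrder.LITTLE_ENDIAN);
            buf.putInt(values.length);
            for (double v : values) {
                buf.putDouble(v);
            }
            buf.force(); // flush the mapped pages to the backing file
        }
    }

    // JVM-side reader: maps the file read-only and decodes the same layout.
    static double[] readRegion(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.order(ByteOrder.LITTLE_ENDIAN);
            int count = buf.getInt();
            // Copying into a heap array is for convenience only; code that
            // wants to avoid the copy would read from `buf` directly.
            double[] out = new double[count];
            for (int i = 0; i < count; i++) {
                out[i] = buf.getDouble();
            }
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("shared-region", ".bin");
        writeRegion(path, new double[] {1.5, 2.5, 4.0});
        double sum = 0.0;
        for (double v : readRegion(path)) {
            sum += v;
        }
        System.out.println("sum of shared values = " + sum);
        Files.deleteIfExists(path);
    }
}
```

The mapped buffer lives outside the Java heap, so the reader sees the pages the writer filled without going through a socket or a serialization step. Coordinating when the region is safe to read (a ready flag, a lock, or a separate signal) is left out here and would need its own design.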