Hi,
As far as I understand, the contents of df.cache() are currently stored on 
the Java heap as a set of byte arrays; see 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala#L58
The data is accessed through the sun.misc.Unsafe API and may sometimes be 
compressed.
CachedBatch is private, and this representation may change in the future.

In general, it is not easy to access this data from a C/C++ API.
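
To illustrate the difficulty, here is a minimal Java sketch. It is not Spark 
code, and the class and method names (HeapVsDirect, onHeap, offHeap, 
roundTrip) are made up for this example. It contrasts a buffer backed by an 
on-heap byte array with a direct (off-heap) buffer. Both behave the same 
from Java, but only the direct buffer lives outside the garbage-collected 
heap. As far as I know, native code can obtain a stable address for a direct 
buffer through JNI's GetDirectBufferAddress, while an on-heap array has to 
be pinned or copied first.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; Spark's CachedBatch does not use this class.
class HeapVsDirect {
    // Buffer backed by a byte[] on the Java heap (the GC may move the array).
    static ByteBuffer onHeap(int bytes) {
        return ByteBuffer.wrap(new byte[bytes]);
    }

    // Buffer allocated outside the Java heap.
    static ByteBuffer offHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    // Write the ints 0, 10, 20, 30 and read back the one at the given index.
    static int roundTrip(ByteBuffer buf, int index) {
        buf.order(ByteOrder.nativeOrder());
        for (int i = 0; i < 4; i++) {
            buf.putInt(i * Integer.BYTES, i * 10);
        }
        return buf.getInt(index * Integer.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer heap = onHeap(16);
        ByteBuffer direct = offHeap(16);
        System.out.println("heap:   isDirect=" + heap.isDirect()
                + " value=" + roundTrip(heap, 2));
        System.out.println("direct: isDirect=" + direct.isDirect()
                + " value=" + roundTrip(direct, 2));
    }
}
```

Any compression applied to the cached bytes would add a further decoding 
step on the native side, which this sketch does not cover.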

Regards,
Kazuaki Ishizaki



From:   Jacek Laskowski <ja...@japila.pl>
To:     "jpivar...@gmail.com" <jpivar...@gmail.com>
Cc:     dev <dev@spark.apache.org>
Date:   2016/05/29 08:18
Subject:        Re: How to access the off-heap representation of cached data in Spark 2.0



Hi Jim,

There's no C++ API in Spark to access the off-heap data. Moreover, I
also think "off-heap" has an overloaded meaning in Spark: it is used
both for Tungsten and for persisting your data off-heap (both are about
memory, but they serve different purposes and are exposed through
client-facing and internal APIs).

That's my limited understanding of the things (and I'm not even sure
how trustworthy it is). Use with extreme caution.

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Sat, May 28, 2016 at 5:29 PM, jpivar...@gmail.com
<jpivar...@gmail.com> wrote:
> Is this not the place to ask such questions? Where can I get a hint as to how
> to access the new off-heap cache, or C++ API, if it exists? I'm willing to
> do my own research, but I have to have a place to start. (In fact, this is
> the first step in that research.)
>
> Thanks,
> -- Jim
>
>
>
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-to-access-the-off-heap-representation-of-cached-data-in-Spark-2-0-tp17701p17717.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
