I'm using livy-0.5.0 with Spark 2.3.0. I started a session with 4 GB of
memory for the driver, and I ran this code several times:

    var tmp1 = spark.sql("use tpcds_bin_partitioned_orc_2")
    var tmp2 = spark.sql("select count(1) from tpcds_bin_partitioned_orc_2.store_sales").show

The table has 5760749 rows of data.
Can you use JNI to call the C++ functionality directly from Java? (A
minimal sketch of that route follows below.)
Or you could wrap this into a MapReduce step outside Spark and use Hadoop
Streaming (it allows you to use shell scripts as mapper and reducer).
You can also write temporary files for each partition and execute the software
within a map step.
Hello,
You could try using the mapPartitions function if you can send partial
data to your C++ program:
mapPartitions(func):
    Similar to map, but runs separately on each partition (block) of the
    RDD, so /func/ must be of type Iterator<T> => Iterator<U> when running
    on an RDD of type T.
That way you ca
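As a hedged sketch of that idea, combined with the temp-file suggestion
above: write each partition to a local file, run the external program on
it, and read its output back. The binary path /opt/thirdparty/process and
its file-in/file-out calling convention are assumptions, and rdd is again
assumed to be an RDD[String]:

    import java.io.{File, PrintWriter}
    import scala.io.Source
    import scala.sys.process._

    val processed = rdd.mapPartitions { records =>
      // Dump this partition's records to a local temp file.
      val inFile  = File.createTempFile("part-in-", ".txt")
      val outFile = File.createTempFile("part-out-", ".txt")
      val writer  = new PrintWriter(inFile)
      records.foreach(r => writer.println(r))
      writer.close()
      // Assumed convention: the binary reads arg 1 and writes to arg 2.
      Seq("/opt/thirdparty/process",
          inFile.getAbsolutePath, outFile.getAbsolutePath).!
      // Materialize the output before deleting the temp files.
      val out = Source.fromFile(outFile).getLines().toList
      inFile.delete(); outFile.delete()
      out.iterator
    }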
It is intentionally not accessible in your code since Utils is internal
Spark code, not part of the public API. Changing Spark to make that private
code public would be inviting trouble, or at least future headaches. If you
don't already know how to build and maintain your own custom fork of Spark
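A simpler route than forking Spark is to copy the tiny helper into your
own code. Utils.nonNegativeMod is essentially the following (matching the
logic in Spark's source):

    // Local copy of the helper, with no dependency on Spark internals.
    def nonNegativeMod(x: Int, mod: Int): Int = {
      val rawMod = x % mod
      // Shift a negative remainder back into the [0, mod) range.
      rawMod + (if (rawMod < 0) mod else 0)
    }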
I have a problem where a critical step needs to be performed by a
third-party C++ application. I can send or install this program on the
worker nodes. I can construct a function holding all the data this
program needs to process. The problem is that the program is designed to
read and write from
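If the program can be adapted to stream records over stdin and stdout (an
assumption, since the message is cut off here), Spark's built-in RDD.pipe
covers this case directly; the paths and binary name below are
placeholders:

    // Each record goes to the program's stdin, one per line; each line
    // the program prints to stdout becomes an output record.
    val processed = sc.textFile("hdfs:///input/data")
      .pipe("/usr/local/bin/thirdparty_app")
    processed.saveAsTextFile("hdfs:///output/data")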
Hi,
I want to use org.apache.spark.util.Utils in def main, but I get the
error:

    Symbol Utils is not accessible from this place.

Here is the code:

    val temp = tokens.map(word => Utils.nonNegativeMod(x, y))
How can I make it accessible?