Hi,
I have a question about passing a dictionary from the driver to the executors
in Spark on YARN. The dictionary is needed inside a UDF. I am using PySpark.
As I understand it, this can be done in two ways (a quick sketch of both follows below):
1. Broadcast the variable and then use it in the UDF
2. Pass the dictionary to the UDF itself
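A minimal sketch of the two options in PySpark (not from the thread; the data, column
and variable names are made up for illustration):

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

lookup = {1: "one", 2: "two", 3: "three"}          # dictionary built on the driver
df = spark.createDataFrame([(1,), (2,), (4,)], ["id"])

# 1. Broadcast the dictionary once and reference broadcast.value inside the UDF;
#    each executor fetches and caches the value a single time.
bc = spark.sparkContext.broadcast(lookup)
lookup_via_broadcast = udf(lambda k: bc.value.get(k, "unknown"), StringType())

# 2. Close over the plain dictionary; it is pickled into the UDF's closure and
#    shipped along with the serialized function to the executors.
lookup_via_closure = udf(lambda k: lookup.get(k, "unknown"), StringType())

df.withColumn("via_broadcast", lookup_via_broadcast("id")) \
  .withColumn("via_closure", lookup_via_closure("id")) \
  .show()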
Hello,
Why does Spark usually hit an off-heap OOM during shuffle read? I have read some
of the source code: when a ResultTask reads shuffle data from a non-local executor,
it has a buffer and can spill to disk, so why does it still run out of off-heap memory?
jib...@qq.com
Depending on the Alluxio version you are running, e.g. for 2.0, the
metrics for local short-circuit reads are not turned on by default.
So I would suggest you first turn on collection of the local
short-circuit read metrics by setting
alluxio.user.metrics.collection.enabled=true
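For reference, a minimal sketch of where that property could go on the client side
(assuming the standard conf/alluxio-site.properties file is used):

# conf/alluxio-site.properties (client side)
alluxio.user.metrics.collection.enabled=true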
Regarding the g
Hi Mark,
You can follow the instructions here:
https://docs.alluxio.io/os/user/stable/en/compute/Spark.html#customize-alluxio-user-properties-for-individual-spark-jobs
Something like this:
$ spark-submit \
    --conf 'spark.driver.extraJavaOptions=-Dalluxio.user.file.writetype.default=CACHE_THROUGH'
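Executors typically need the property as well; a sketch, assuming the same write type
is wanted on the executor side (via the standard spark.executor.extraJavaOptions setting):

$ spark-submit \
    --conf 'spark.driver.extraJavaOptions=-Dalluxio.user.file.writetype.default=CACHE_THROUGH' \
    --conf 'spark.executor.extraJavaOptions=-Dalluxio.user.file.writetype.default=CACHE_THROUGH' \
    ...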
Hello,
I have 2 parquets (each containing 1 file):
- parquet-wide - schema has 25 top level cols + 1 array
- parquet-narrow - schema has 3 top level cols
Both files have the same data for the shared columns.
When I read from parquet-wide, Spark reports read 52.6 KB; from
parquet-narrow only 2.6 K
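Not from the thread, but a minimal PySpark sketch of the comparison being described
(the paths and column names are hypothetical); the bytes read per scan can then be
compared in the Spark UI input metrics:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical paths standing in for the two files described above
wide = spark.read.parquet("/data/parquet-wide")
narrow = spark.read.parquet("/data/parquet-narrow")

# Read the same three top-level columns from both and force a full scan
cols = ["a", "b", "c"]   # hypothetical column names
wide.select(*cols).count()
narrow.select(*cols).count()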
I am also interested. Many of the docs/books that I've seen are
practical, usage-focused examples rather than the deep internals of Spark.
On Wed, 18 Sep 2019 21:12:12 -1100 vipul.s.p...@gmail.com wrote
Yes,
I realize what you were looking for, I am also looking for the same docs.
Haven
Hi,
Consider the following statements:
1)
> scala> val df = spark.read.format("com.shubham.MyDataSource").load
> scala> df.show
> +---+---+
> | i| j|
> +---+---+
> | 0| 0|
> | 1| -1|
> | 2| -2|
> | 3| -3|
> | 4| -4|
> +---+---+
2)
> scala> val df1 = df.filter("i < 3")
> scala> df1.show
Hi,
How can I create an initial state by hand so that the Structured Streaming
file source only reads data which is semantically greater (i.e. whose file path
compares lexicographically greater) than the minimum committed initial state?
Details here:
https://stackoverflow.com/questions/58004832/spark-structured-s
Yes,
I realize what you were looking for, I am also looking for the same docs.
Haven't found them yet. Also, Jacek Laskowski's gitbooks are the next best
thing to follow, if you haven't read them yet.
Regards
On Thu, Sep 19, 2019 at 12:46 PM wrote:
> Thanks Vipul,
>
>
>
> I was looking specifically for do
Thanks Vipul,
I was looking specifically for the documents Spark committers use for reference.
Currently I've put custom logs into the spark-core sources, then I build and run
jobs on it. From the printed logs I try to understand the execution flows.
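Not from the thread, but as a lighter-weight complement to rebuilding spark-core,
the existing logging can be made much more verbose for the packages of interest.
A sketch, assuming the usual conf/log4j.properties and example package names:

# conf/log4j.properties (sketch)
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# follow the scheduling and shuffle code paths in detail
log4j.logger.org.apache.spark.scheduler=DEBUG
log4j.logger.org.apache.spark.shuffle=DEBUG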
From: Vipul Rajan
Sent: Thursday, September 19, 2019 12:23