> allocation. Am I wrong?
>
> -Thodoris
>
> On 11 Jul 2018, at 17:09, Pavel Plotnikov
> wrote:
>
> Hi, Thodoris
> You can configure resources per executor and manipulate the number of
> executors instead of using spark.cores.max. I think
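A minimal sketch of the per-executor sizing suggested above (Scala; the values and the SparkSession-based configuration are only an illustration, not taken from the thread):

  import org.apache.spark.sql.SparkSession

  // Size each executor explicitly and cap the total cores for the application,
  // instead of leaving everything to the defaults. Values are example numbers.
  val spark = SparkSession.builder()
    .appName("resource-sizing-example")
    .config("spark.executor.cores", "4")    // cores per executor
    .config("spark.executor.memory", "8g")  // memory per executor
    .config("spark.cores.max", "32")        // total cores the application may take
    .getOrCreate()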
> It seems that we can't control the resource usage of an application. By
> the way, we are not using dynamic allocation.
>
> - Thodoris
>
>
> On 10 Jul 2018, at 14:35, Pavel Plotnikov
> wrote:
>
> Hello Thodoris!
> Have you checked this:
> - does mesos cluster ha
Hello Thodoris!
Have you checked this:
- does the mesos cluster have available resources?
- does spark have tasks waiting in the queue for longer than the
spark.dynamicAllocation.schedulerBacklogTimeout configuration value?
- and then, have you checked that mesos sends offers to the spark app's mesos
framework at least w
ler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala#L316
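For reference, a minimal sketch of the dynamic-allocation settings the checklist refers to (Scala); the concrete values are only examples and assume the Mesos external shuffle service is running on the agents:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("dynamic-allocation-example")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.shuffle.service.enabled", "true")                 // required for dynamic allocation
    .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s") // how long tasks may queue before more executors are requested
    .config("spark.dynamicAllocation.minExecutors", "1")
    .config("spark.dynamicAllocation.maxExecutors", "20")
    .getOrCreate()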
>
> On Mon, Apr 24, 2017 at 4:53 AM, Pavel Plotnikov <
> pavel.plotni...@team.wrike.com> wrote:
>
>> Hi, everyone! I run spark 2.1.0 jobs on top of a Mesos cluster in
>> coarse-grained mode with dynamic
Hi, everyone! I run spark 2.1.0 jobs on top of a Mesos cluster in
coarse-grained mode with dynamic resource allocation. And sometimes the spark
mesos scheduler declines mesos offers even though not all of the available
resources are used (I have fewer workers than the possible maximum) and the max
Hi, Henry
In the first example the dict d always contains only one value because the_Id
is the same; in the second case the dict grows very quickly.
So I can suggest first applying a map function to split your file of strings
into rows, then repartitioning, and then applying the custom logic.
Example:
def splitf(
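The example is cut off here; below is a rough sketch of what the suggested map-then-repartition flow might look like. The names splitf, customLogic and process, and the partition count, are illustrative only, not the original code:

  import org.apache.spark.rdd.RDD

  // 1) split each large input string into individual rows
  def splitf(block: String): Seq[String] = block.split("\n").toSeq

  // placeholder for the custom per-row logic
  def customLogic(row: String): (String, Int) = (row, row.length)

  def process(raw: RDD[String]): RDD[(String, Int)] =
    raw.flatMap(splitf)   // split first
       .repartition(200)  // 2) spread the rows evenly across partitions
       .map(customLogic)  // 3) then apply the custom logic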
Hi, Alvaro
You can create different clusters using the standalone cluster manager and
then manage subsets of machines by submitting applications to different
masters. Or you can use Mesos attributes to mark a subset of workers and
specify it in spark.mesos.constraints.
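A minimal sketch of the Mesos-attributes approach (Scala); the attribute "group:analytics" and the master URL are made-up examples, and the matching agents would be started with something like --attributes="group:analytics":

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("constrained-job")
    .master("mesos://zk://zk-host:2181/mesos")             // example Mesos master URL
    .config("spark.mesos.constraints", "group:analytics")  // accept offers only from agents with this attribute
    .getOrCreate()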
On Tue, Feb 7, 2017 at 1:21 PM
th)
and then
dropDF.repartition(1).write.mode(SaveMode.ErrorIfExists).parquet(targetpath)
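A rough sketch of the two-step write this snippet seems to describe: write with the existing partitioning to a temporary path first, then read it back and compact it into a single file. The tmpPath parameter and the wrapper function are assumptions; dropDF and targetPath come from the snippet:

  import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

  def writeCompacted(spark: SparkSession, dropDF: DataFrame,
                     tmpPath: String, targetPath: String): Unit = {
    dropDF.write.mode(SaveMode.Overwrite).parquet(tmpPath)    // parallel write with many partitions
    spark.read.parquet(tmpPath)
      .repartition(1)                                         // compact into a single output file
      .write.mode(SaveMode.ErrorIfExists).parquet(targetPath)
  }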
Best,
On Sun, Jan 22, 2017 at 12:31 PM Yang Cao wrote:
> Also, do you know why this happens?
>
> On 2017年1月20日, at 18:23, Pavel Plotnikov
> wrote:
>
> Hi Yang,
> I have faced wi
Hi Yang,
I have faced the same problem on Mesos, and to work around it I usually
increase the partition number. In the last step of your code you reduce the
number of partitions to 1; try setting a bigger value, maybe that will solve
the problem.
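A minimal sketch of that suggestion (Scala), assuming df is the DataFrame being written; the partition count and path are example values only:

  import org.apache.spark.sql.DataFrame

  def writeHourly(df: DataFrame, path: String): Unit =
    df.repartition(64)   // keep many partitions instead of collapsing to 1
      .write
      .mode("overwrite")
      .parquet(path)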
Cheers,
Pavel
On Fri, Jan 20, 2017 at 12:35 PM Yang Cao
Hi,
Maybe sc.hadoopConfiguration.setInt("dfs.blocksize", blockSize) helps
you.
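A minimal sketch of how that hint could be applied; the 128 MB value and the input path are only examples, and sc is assumed to be an existing SparkContext:

  val blockSize = 128 * 1024 * 1024                          // 128 MB, example value
  sc.hadoopConfiguration.setInt("dfs.blocksize", blockSize)  // set before reading the data
  val data = sc.textFile("/data/input")                      // input splits follow the configured block size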
Best Regards,
Pavel
On Tue, Jan 26, 2016 at 7:13 AM Jia Zou wrote:
> Dear all,
>
> First, an update: the local file system data partition size can be
> tuned by:
> sc.hadoopConfiguration().setLong("fs.local.bl
21, 2016 at 10:35 AM Jörn Franke wrote:
> What is your data size, the algorithm and the expected time?
> Depending on this, the group can recommend optimizations or tell you
> that the expectations are wrong.
>
> On 20 Jan 2016, at 18:24, Pavel Plotnikov
> wrote:
>
> Than
Thanks, Akhil! It helps, but this job is still not fast enough; maybe I
missed something.
Regards,
Pavel
On Wed, Jan 20, 2016 at 9:51 AM Akhil Das
wrote:
> Did you try re-partitioning the data before doing the write?
>
> Thanks
> Best Regards
>
> On Tue, Jan 19, 2016 at 6:13 P
Hi,
I'm using Spark in standalone mode without HDFS, and a shared folder is
mounted on the nodes via NFS. It looks like each node writes data as it would
to a local file system.
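A minimal sketch of that setup (Scala), assuming an existing SparkSession named spark and that the NFS share is mounted at the same path on every node; /mnt/shared is just an example mount point:

  // With no HDFS, plain file:// URIs work as long as every worker sees the same mount.
  val df = spark.read.parquet("file:///mnt/shared/input")
  df.write.mode("overwrite").parquet("file:///mnt/shared/output")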
Regards,
Pavel
On Tue, Jan 19, 2016 at 5:39 PM Jia Zou wrote:
> Dear all,
>
> Can I configure Spark on multiple nodes without HDFS,
Hello,
I'm using Spark on some machines in standalone mode; the data storage is
mounted on these machines via NFS. I have an input data stream, and when I try
to store all of the data for an hour in Parquet, the job executes mostly on one
core and the hourly data takes 40-50 minutes to store. It is very slow!
And