BTW, my spark.python.worker.reuse setting is set to "true".
--
So I'm running PySpark 1.3.1 on Amazon EMR on a fairly beefy cluster (20 nodes, each with 32 cores and 64 GB of memory), and my parallelism, executor.instances, executor.cores, and executor memory settings are also fairly reasonable (600, 20, 30, and 48 GB respectively).
However, my job invariably fails when it hits a mid-sized broadcast.
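
For reference, a minimal sketch of how those settings map onto Spark configuration properties, assuming they're applied programmatically via SparkConf (in practice they could just as well be spark-submit flags); the values are the ones quoted above, plus the spark.python.worker.reuse setting mentioned earlier in the thread:

from pyspark import SparkConf, SparkContext

# Values taken from the description above; property names are the
# standard Spark configuration keys.
conf = (SparkConf()
        .set("spark.default.parallelism", "600")
        .set("spark.executor.instances", "20")
        .set("spark.executor.cores", "30")
        .set("spark.executor.memory", "48g")
        .set("spark.python.worker.reuse", "true"))
sc = SparkContext(conf=conf)
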
Hi,
So my Spark app needs to run a sliding window through a time series dataset (I'm not using Spark Streaming) and then run different types of aggregations on a per-window basis. Right now I'm using a groupByKey(), which gives me an Iterable for each window. There are a few concerns I have with this approach.
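
To make the approach concrete, here is a minimal sketch of the groupByKey() pattern described above; the window size, slide step, toy records, and the sum aggregation are all placeholder assumptions for illustration:

from pyspark import SparkContext

sc = SparkContext(appName="sliding-window-sketch")

# Toy (timestamp_in_seconds, value) records standing in for the time series.
rdd = sc.parallelize([(0, 1.0), (25, 2.0), (70, 3.0), (95, 4.0)])

window_size = 60  # each window covers [start, start + window_size)
step = 30         # windows start every `step` seconds, so they overlap

def assign_windows(record):
    ts, value = record
    # Emit (window_start, record) for every window that contains this timestamp.
    first = max(0, (ts - window_size) // step + 1)
    for w in range(first, ts // step + 1):
        yield (w * step, (ts, value))

# groupByKey() yields one Iterable of records per window; aggregate each window.
per_window = rdd.flatMap(assign_windows).groupByKey()
sums = per_window.mapValues(lambda recs: sum(v for _, v in recs))
print(sums.collect())
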
How do I set up HADOOP_CONF_DIR correctly when I'm running my Spark job on YARN? My YARN environment has the correct HADOOP_CONF_DIR setting, but the configuration that I pull from sc.hadoopConfiguration() is incorrect.
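
For context, a minimal sketch of the kind of check described above, assuming PySpark, where sc._jsc.hadoopConfiguration() is the internal Py4J route to the same Hadoop Configuration object (the two property names are just examples):

from pyspark import SparkContext

sc = SparkContext(appName="hadoop-conf-check")

# sc._jsc is PySpark's internal JavaSparkContext handle; hadoopConfiguration()
# returns the org.apache.hadoop.conf.Configuration the driver is using.
# HADOOP_CONF_DIR itself is normally exported in the shell (or spark-env.sh)
# before spark-submit; this only shows what the driver ended up with.
hadoop_conf = sc._jsc.hadoopConfiguration()
print(hadoop_conf.get("fs.defaultFS"))
print(hadoop_conf.get("yarn.resourcemanager.hostname"))
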
--
Apologies if this is something very obvious, but I've perused the Spark Streaming guide and this still isn't very evident to me. So I have files with data of the format: timestamp,column1,column2,column3.. etc., and I'd like to use Spark Streaming's window operations on them.
However from what I
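
In case it helps frame the question, here is a minimal sketch of applying window operations to files landing in a directory, assuming PySpark's DStream API. The path, batch interval, window/slide durations, and the count() aggregation are illustrative placeholders; note that DStream windows are defined over batch arrival time, not the timestamp column inside the files.

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="csv-window-sketch")
ssc = StreamingContext(sc, batchDuration=10)  # 10-second micro-batches

# Watch a directory for new files; each line is "timestamp,column1,column2,..."
lines = ssc.textFileStream("hdfs:///path/to/incoming")  # hypothetical path
rows = lines.map(lambda line: line.split(","))

# 60-second window sliding every 20 seconds (both multiples of the batch interval).
windowed = rows.window(windowDuration=60, slideDuration=20)
windowed.count().pprint()

ssc.start()
ssc.awaitTermination()
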