date:20190425

Re: [VOTE] Release Apache Spark 2.4.2

2019-04-25 Thread Terry Kim

Very much interested in hearing what you folks decide. We currently have a couple of people asking us questions at https://github.com/dotnet/spark/issues. Thanks, Terry

Re: FW: Stage 152 contains a task of very large size (12747 KB). The maximum recommended task size is 100 KB

2019-04-25 Thread Russell Spitzer

I usually only see that with regard to folks parallelizing very large objects. From what I know, it's really just the data inside the "Partition" class of the RDD that is being sent back and forth. So usually it's something like spark.parallelize(Seq(reallyBigMap)). The parallelize