I filed two Jira issues:
1) Performance when querying nested columns
   https://issues.apache.org/jira/browse/SPARK-16320
2) PySpark performance
   https://issues.apache.org/jira/browse/SPARK-16321

I also found existing Jira issues for:
1) PPD on nested columns
   https://issues.apache.org/jira/browse/SPARK-5151
2) Dropped support for df.map etc. in PySpark
   https://issues.apache.org/jira/browse/SPARK-13594

2016-06-30 0:47 GMT+02:00 Michael Allman <mich...@videoamp.com>:
> The patch we use in production is for 1.5. We're porting the patch to master
> (and downstream to 2.0, which is presently very similar) with the intention
> of submitting a PR "soon". We'll push it here when it's ready:
> https://github.com/VideoAmp/spark-public.
>
> Regarding benchmarking, we have a suite of Spark SQL regression tests which
> we run to check correctness and performance. I can share our findings when I
> have them.
>
> Cheers,
>
> Michael
>
>> On Jun 29, 2016, at 2:39 PM, Maciej Bryński <mac...@brynski.pl> wrote:
>>
>> 2016-06-29 23:22 GMT+02:00 Michael Allman <mich...@videoamp.com>:
>>> I'm sorry I don't have any concrete advice for you, but I hope this helps
>>> shed some light on the current support in Spark for projection pushdown.
>>>
>>> Michael
>>
>> Michael,
>> Thanks for the answer. This resolves one of my questions.
>> Which Spark version have you patched? 1.6? Are you planning to
>> publish this patch, or is it just for the 2.0 branch?
>>
>> I'd gladly help with some benchmarks in my environment.
>>
>> Regards,
>> --
>> Maciek Bryński
>

--
Maciek Bryński