I'm using Spark 1.4.0. Sure, I'll try to put together steps to reproduce
and file a JIRA ticket.
Thanks,
Peter Rudenko
On 2015-06-26 11:14, Josh Rosen wrote:
Which Spark version are you using? Can you file a JIRA for this issue?
On Thu, Jun 25, 2015 at 6:35 AM, Peter Rudenko
wrote:
Hi, I have a small but very wide dataset (2000 columns). I'm trying to
optimize a DataFrame pipeline for it, since it performs very poorly
compared to the equivalent RDD operations.
With spark.sql.codegen=true it fails with a stack overflow:
15/06/25 16:27:16 INFO CacheManager: Partition rdd_12_3 not found,
computing it 1
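
For the reproduction steps, a small but wide input like the one described could be generated with something along these lines. This is only a sketch, not the actual dataset: the 2000-column width comes from the report above, while the row count, the column names (c0, c1, ...), the random float values and the helper names write_wide_csv / make_wide_csv_text are all assumptions made for illustration. It uses only the Python standard library; loading the file into a DataFrame and enabling spark.sql.codegen is not shown here.

```python
import csv
import io
import random


def write_wide_csv(out, num_rows=100, num_cols=2000, seed=0):
    """Write a header line plus num_rows rows of num_cols random floats to `out`.

    All defaults except num_cols=2000 are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the generated data is repeatable
    writer = csv.writer(out)
    # Hypothetical column names: c0, c1, ..., c1999
    writer.writerow(["c%d" % i for i in range(num_cols)])
    for _ in range(num_rows):
        writer.writerow(["%.4f" % rng.random() for _ in range(num_cols)])


def make_wide_csv_text(num_rows=100, num_cols=2000, seed=0):
    """Return the same CSV content as a single in-memory string."""
    buf = io.StringIO()
    write_wide_csv(buf, num_rows, num_cols, seed)
    return buf.getvalue()
```

To write the data to disk, open a file with newline="" and pass it to write_wide_csv. Keeping the row count small matches the "small but very wide" shape, so any slowdown or failure is more likely tied to the number of columns than to data volume.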