Re: org.apache.spark.shuffle.FetchFailedException: Too large frame:

2018-05-03 Thread Ryan Blue
Yes, you can usually use a broadcast join to avoid skew problems.

On Wed, May 2, 2018 at 8:57 PM, Pralabh Kumar wrote:
> I am performing a join operation. If I convert the reduce-side join to a map-side join
> (no shuffle will happen), I assume this error shouldn't
> occur. Let me know if th
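To illustrate why a map-side (broadcast) join sidesteps the skewed shuffle, here is a minimal plain-Python sketch. It is not Spark code: the data, `build_lookup`, and `map_side_join` are hypothetical names made up for illustration. The idea is that every task gets a full copy of the small table and joins its own slice of the large table locally, so rows for a hot key are never gathered into one oversized shuffle block. In PySpark the same strategy is usually requested with the `broadcast` hint from `pyspark.sql.functions`, assuming the small side fits in memory on each executor.

```python
from collections import defaultdict


def build_lookup(small_rows):
    """Index the small table by join key; this is the copy each task would hold."""
    lookup = defaultdict(list)
    for key, value in small_rows:
        lookup[key].append(value)
    return lookup


def map_side_join(partition, lookup):
    """Inner-join one partition of the large table locally, without repartitioning by key."""
    for key, value in partition:
        # .get() does not create missing entries, so unmatched keys are simply skipped.
        for small_value in lookup.get(key, ()):
            yield (key, value, small_value)


# Hypothetical data: the key "hot" is heavily skewed in the large table.
large_partitions = [
    [("hot", 1), ("hot", 2), ("a", 3)],
    [("hot", 4), ("b", 5), ("hot", 6)],
]
small_table = [("hot", "H"), ("a", "A")]

lookup = build_lookup(small_table)
joined = [row for part in large_partitions for row in map_side_join(part, lookup)]
```

Each partition is processed where it already sits, so the amount of work per task depends on the partition size rather than on how many rows share a key.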

Re: org.apache.spark.shuffle.FetchFailedException: Too large frame:

2018-05-02 Thread Pralabh Kumar
I am performing a join operation. If I convert the reduce-side join to a map-side join (no shuffle will happen), I assume this error shouldn't occur. Let me know if this understanding is correct.

On Tue, May 1, 2018 at 9:37 PM, Ryan Blue wrote:
> This is usually caused by skew. Sometimes you

Re: org.apache.spark.shuffle.FetchFailedException: Too large frame:

2018-05-01 Thread Ryan Blue
This is usually caused by skew. Sometimes you can work around it by increasing the number of partitions like you tried, but when that doesn’t work you need to change the partitioning that you’re using. If you’re aggregating, try adding an intermediate aggregation. For example, if your query is se
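The example in the message above is cut off in this archive view, so the following is not a reconstruction of it. It is a minimal plain-Python sketch of one common way to add an intermediate aggregation: attach a random "salt" to each key, aggregate by (key, salt) first, then aggregate the partial results by key alone. The salt, the function names, and the data are all assumptions for illustration, and the approach only applies to aggregations whose partial results can be combined (sum, count, min, max).

```python
import random
from collections import Counter


def partial_aggregate(rows, num_salts, rng):
    """Stage 1: sum by (key, salt), spreading one hot key over up to num_salts groups."""
    partial = Counter()
    for key, amount in rows:
        partial[(key, rng.randrange(num_salts))] += amount
    return partial


def final_aggregate(partial):
    """Stage 2: drop the salt and combine the much smaller set of partial sums per key."""
    totals = Counter()
    for (key, _salt), subtotal in partial.items():
        totals[key] += subtotal
    return totals


# Hypothetical skewed input: 1000 rows for "hot", a handful for the other keys.
rows = [("hot", 1)] * 1000 + [("a", 2)] * 3 + [("b", 5)]

rng = random.Random(0)  # seeded only so the sketch is repeatable
partial = partial_aggregate(rows, num_salts=8, rng=rng)
totals = final_aggregate(partial)
```

In a distributed setting the first stage is what gets shuffled on the salted key, so no single reducer has to fetch all rows for the hot key; the second stage only sees at most `num_salts` partial rows per key.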