Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-26 Thread Liu Genie
e_...@outlook.com Subject: Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen Can you please explain what you mean by that? How do you use a UDF to replace a join? Thanks O

Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-25 Thread Jianneng Li
an't do all joins this way. Best, Jianneng From: yeikel valdes Sent: Tuesday, February 25, 2020 5:48 AM To: Jianneng Li Cc: user@spark.apache.org ; genie_...@outlook.com Subject: Re: [Spark SQL] Memory problems with packing too many joins into the same W

Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-25 Thread yeikel valdes
From: Liu Genie Sent: Monday, February 24, 2020 6:39 PM To: user@spark.apache.org Subject: Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen I have encountered the too-many-joins problem before. Since the joined DataFrame was small enough, I converted j

Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-24 Thread Jianneng Li
Thanks Genie. Unfortunately, the joins I'm doing in this case are large, so a UDF likely won't work. Jianneng From: Liu Genie Sent: Monday, February 24, 2020 6:39 PM To: user@spark.apache.org Subject: Re: [Spark SQL] Memory problems with packing too

Re: [Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-24 Thread Liu Genie
I have encountered the too-many-joins problem before. Since the joined DataFrame was small enough, I converted the join to a UDF operation, which was much faster and did not cause out-of-memory problems. On February 25, 2020, at 10:15, Jianneng Li <jianneng...@workday.com> wrote: Hello everyone, WholeStageCodegen ge
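The message above does not include code, so the following is only a sketch of one common way a join against a small table might be replaced by a UDF: the small table is collected into a dictionary, broadcast to the executors, and looked up per row. The table contents, the column names `country_code` and `country_name`, and the helpers `lookup_country` and `with_country_name` are all hypothetical, and the Spark part assumes PySpark is installed.

```python
# Hypothetical small dimension table: country code -> country name.
COUNTRY_NAMES = {"US": "United States", "DE": "Germany", "JP": "Japan"}


def lookup_country(code, table):
    """Return the name for `code`, or None when the key is missing.

    Returning None for a missing key mirrors the null a left outer
    join would produce for an unmatched row.
    """
    return table.get(code)


def with_country_name(df, spark):
    """Add a country_name column to `df` without a join (needs pyspark)."""
    # Imported lazily so the lookup logic above works without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    # Ship the small table to every executor once.
    names = spark.sparkContext.broadcast(COUNTRY_NAMES)

    # The UDF performs the per-row lookup that the join would have done.
    lookup_udf = F.udf(
        lambda code: lookup_country(code, names.value), StringType()
    )
    return df.withColumn("country_name", lookup_udf(F.col("country_code")))
```

As the replies in this thread point out, this only applies when the joined table is small enough to fit in memory on each executor; it is not a fix for large joins.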

[Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen

2020-02-24 Thread Jianneng Li
Hello everyone, WholeStageCodegen generates code that appends results into a BufferedRowIterator, which keeps the results in an in-memory linked list