Hi, I hope this answers your question.
You can hint the broadcast in SQL as detailed here:
https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-joins-broadcast.html
(thanks, Jacek :) )
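As a rough sketch of what such a hint could look like, here it is applied to the query from the question, reusing its table names A and B. I have not run this against your data; it assumes a Spark version that supports SQL join hints (2.2 or later), and the hint is a request to the planner rather than a guarantee.

```sql
-- Sketch only: ask Spark to broadcast the small table A.
-- A is on the right side of the LEFT JOIN, which is the side Spark
-- is able to broadcast for a left outer join.
select /*+ BROADCAST(A) */
       A.*,
       B.nsf_cards_ratio * 1.00 / A.nsf_on_entry as nsf_ratio_to_pop
from B
left join A
on trim(A.country) = trim(B.cntry_code);
```

Running `explain` on the query should show whether the plan uses a broadcast join or still falls back to a sort-merge join.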
I'd recommend creating a temporary table with the trimming you use in the
join (for clarity). Also kee
Hi,
We were running Logistic Regression in Spark 2.2.X and then tried to see
how it does in Spark 2.3.X. We are now facing an issue while running a
Logistic Regression model in Spark 2.3.X on top of YARN (GCP Dataproc). In
the TreeAggregate method it takes a very long time due to very High GC Acti
Hi all:
I want to ask a question about broadcast joins in Spark SQL.
```
select A.*,B.nsf_cards_ratio * 1.00 / A.nsf_on_entry as nsf_ratio_to_pop
from B
left join A
on trim(A.country) = trim(B.cntry_code);
```
Here A is a small table with only 8 rows, but somehow the statistics of table A has
@Sean Owen Thank you very much. I saw your reply
in https://issues.apache.org/jira/browse/SPARK-28519. I will test
with the modification to see whether other similar tests fail,
and will address them together in one pull request.
On Sat, Jul 27, 2019 at 9:04 PM Sean Owen w