Re: Cartesian join on RDDs taking too much time

Mich Talebzadeh Wed, 25 May 2016 00:21:06 -0700

It is basically a Cartesian join, the same as in an RDBMS.

Example:

SELECT * FROM FinancialCodes, FinancialData;

The result of this query pairs every row in the FinancialCodes table
with every row in the FinancialData table. Each result row consists of all
columns from the FinancialCodes table followed by all columns from the
FinancialData table.

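Not very useful in most cases, and the number of result rows is the row
count of one table multiplied by the row count of the other, so it gets
expensive quickly.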
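
To make the row pairing concrete, here is a minimal sketch in plain Python
(not Spark, and not the poster's actual data). The sample rows and the
cartesian_join helper are made up for illustration, since the real columns
of FinancialCodes and FinancialData are not given in this thread.

```python
from itertools import product

# Hypothetical sample rows; the real table contents are not known.
financial_codes = [("C1", "Equity"), ("C2", "Bond"), ("C3", "FX")]
financial_data = [(101, 9.5), (102, 12.0)]


def cartesian_join(left, right):
    """Pair every left row with every right row, left columns first."""
    return [l + r for l, r in product(left, right)]


rows = cartesian_join(financial_codes, financial_data)

# The output size is len(left) * len(right): here 3 * 2 rows.
print(len(rows))
# Each output row is the left row's columns followed by the right row's.
print(rows[0])
```

With two inputs of a few hundred thousand rows each, the same pairing
produces tens of billions of output rows, which is the usual reason a
Cartesian join runs for a long time even when the inputs themselves are
small on disk.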

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 25 May 2016 at 08:05, Priya Ch <learnings.chitt...@gmail.com> wrote:

> Hi All,
>
>   I have two RDDs A and B where in A is of size 30 MB and B is of size 7
> MB, A.cartesian(B) is taking too much time. Is there any bottleneck in
> cartesian operation ?
>
> I am using spark 1.6.0 version
>
> Regards,
> Padma Ch
>
