> ...cally pulling out selected columns from the query, but there is no
> roll up happening or anything that would possibly make it suspicious that
> there is any difference besides the type of joins. The tables are matched
> based on three keys that are present in both tables (ad, account, and
> date); on top of this, they are filtered by date being above 2016-01-03.
> Since all the joins are...
> ...mo_lt where date >='2016-01-03'").count
>
> res14: Long = 34158
>
> scala> sqlContext.sql("select * from dps_pin_promo_lt where date >='2016-01-03...
> res15: Long = 42693
>
> The above two queries filter the data on the date used by the joins
> (2016-01-03), and you can see that the row counts of the two tables are
> different, which is why I suspect something is wrong w... joins in Spark
> SQL: in this situation the right and outer joins may produce the same
> results, but they should not equal the left join and definitely not the
> inner join, unless I am missing something.
>
l value is res16: Long = 42694
> >
> >
> > Thanks,
> >
> >
> > KP
> >
> >
> >
> >
> > On Mon, May 2, 2016 at 12:50 PM, Yong Zhang
> wrote:
> >>
> >> We are still not sure what is the problem, if you cannot show u
ultSet row count as dps right outer join
>> with swig on 3 columns, with same additional filters.
>>
>> Without knowing your data, I cannot see the reason that has to be a bug in
>> the spark.
>>
>> Am I misunderstanding your bug?
>>
>> Yong
>>
--------------
From: kpe...@gmail.com
Date: Mon, 2 May 2016 12:11:18 -0700
Subject: Re: Weird results with Spark SQL Outer joins
To: gourav.sengu...@gmail.com
CC: user@spark.apache.org

Gourav,
I wish that were the case, but I have done a select count on each of the two
tables individually, and they return different numbers of rows:
dps.registerTempTable("dps_pin_promo_lt")
swig.registerTempTable("swig_pin_promo_lt")
dps.count()
RESULT: 42632
swig.count()
RESULT: 42034
On
This shows that both the tables have matching records and no mismatches.
Therefore obviously you have the same results irrespective of whether you
use right or left join.
I think that there is no problem here, unless I am missing something.
Regards,
Gourav
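One thing worth ruling out before blaming the join implementation is the date filter. The sketch below is a plain-Scala model, not Spark code and not anything posted in this thread. It assumes the filter sits in a WHERE clause that tests the date column of both tables (the full query is truncated here, so this is an assumption). The `Rec` rows, the `JoinSketch` object, and its helper names are all hypothetical.

```scala
// Hypothetical miniature tables; none of these rows come from the thread.
case class Rec(ad: String, account: String, date: String)

object JoinSketch {
  type Pair = (Option[Rec], Option[Rec])

  val swig: Seq[Rec] = Seq(Rec("ad1", "acc1", "2016-01-04"), Rec("ad2", "acc1", "2016-01-05"))
  val dps: Seq[Rec]  = Seq(Rec("ad1", "acc1", "2016-01-04"), Rec("ad3", "acc2", "2016-01-06"))

  // Full outer join on (ad, account, date); a missing side is None, like SQL NULL.
  // Rec holds only the three key fields, so == is exactly the join condition.
  def fullOuter(s: Seq[Rec], d: Seq[Rec]): Seq[Pair] = {
    val matched: Seq[Pair] = for (a <- s; b <- d if a == b) yield (Option(a), Option(b))
    val sOnly: Seq[Pair] = s.filterNot(a => d.contains(a)).map(a => (Option(a), Option.empty[Rec]))
    val dOnly: Seq[Pair] = d.filterNot(b => s.contains(b)).map(b => (Option.empty[Rec], Option(b)))
    matched ++ sOnly ++ dOnly
  }

  // The other join types are the full outer join minus the unmatched rows of one or both sides.
  def leftOuter(s: Seq[Rec], d: Seq[Rec]): Seq[Pair]  = fullOuter(s, d).filter(_._1.isDefined)
  def rightOuter(s: Seq[Rec], d: Seq[Rec]): Seq[Pair] = fullOuter(s, d).filter(_._2.isDefined)
  def inner(s: Seq[Rec], d: Seq[Rec]): Seq[Pair] =
    fullOuter(s, d).filter(p => p._1.isDefined && p._2.isDefined)

  // Models WHERE s.date >= cutoff AND d.date >= cutoff applied after the join:
  // a comparison against a missing (NULL) side is not true, so that row is dropped.
  def whereBothDates(rows: Seq[Pair], cutoff: String): Seq[Pair] =
    rows.filter { case (s, d) => s.exists(_.date >= cutoff) && d.exists(_.date >= cutoff) }

  def main(args: Array[String]): Unit = {
    val cutoff = "2016-01-03"
    val joins = Seq(
      "inner" -> inner(swig, dps),
      "left"  -> leftOuter(swig, dps),
      "right" -> rightOuter(swig, dps),
      "full"  -> fullOuter(swig, dps)
    )
    for ((name, rows) <- joins)
      println(s"$name: unfiltered=${rows.size} filtered=${whereBothDates(rows, cutoff).size}")
  }
}
```

With this sample data the unfiltered counts differ by join type (inner 1, left 2, right 2, full 3), but once both date columns are tested after the join, every variant keeps only the single matched row. If the real query has this shape, identical counts across join types would be expected even though the two tables hold different rows.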
On Mon, May 2, 2016 at 7:48 PM, kpeng1
Also, the results of the inner query produced the same results:
sqlContext.sql("SELECT s.date AS edate , s.account AS s_acc , d.account AS
d_acc , s.ad as s_ad , d.ad as d_ad , s.spend AS s_spend ,
d.spend_in_dollar AS d_spend FROM swig_pin_promo_lt s INNER JOIN
dps_pin_promo_lt d ON (s.date
Hi Kevin,
Thanks.
Please post the result of the same query with INNER JOIN and then it will
give us a bit of insight.
Regards,
Gourav
On Mon, May 2, 2016 at 7:10 PM, Kevin Peng wrote:
Gourav,
Apologies. I edited my post with this information:
Spark version: 1.6
Result from spark shell
OS: Linux version 2.6.32-431.20.3.el6.x86_64 (
mockbu...@c6b9.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat
4.4.7-4) (GCC) ) #1 SMP Thu Jun 19 21:14:45 UTC 2014
Thanks,
KP
On Mon,
Hi,
As always, can you please write down details regarding your SPARK cluster -
the version, OS, IDE used, etc?
Regards,
Gourav Sengupta
On Mon, May 2, 2016 at 5:58 PM, kpeng1 wrote:
> Hi All,
>
> I am running into a weird result with Spark SQL Outer joins. The results
> for all of them seem