subject:"Spark join produce duplicate rows in resultset"

Re: Spark join produce duplicate rows in resultset

2023-10-27 Thread Meena Rajani

Thanks all: Patrick's suggestion to select rev.* and I.* cleared up the confusion. The item table actually brought 4 rows, hence the final result set had 4 rows.

Regards,
Meena

On Sun, Oct 22, 2023 at 10:13 AM Bjørn Jørgensen wrote:
> also remove the space in rev. scode
>
> On Sun, Oct 22, 2023 at 19:08, Sadha Ch
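The outcome reported above (rows repeating because item held four matching rows) can be checked before running the join by counting rows per join key. This is a minimal sketch, not the poster's actual schema: it assumes a hypothetical item table keyed only on sys, uses made-up data, and runs on Python's built-in sqlite3 rather than Spark so that it is self-contained.

```python
import sqlite3

# Hypothetical stand-in for the item table; the real columns are not shown in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER, sys TEXT);
    INSERT INTO item VALUES (10, 'A'), (11, 'A'), (12, 'A'), (13, 'A'), (20, 'B');
""")

# Any key listed here will multiply the matching rows on the other side of a join.
fanout = conn.execute("""
    SELECT sys, COUNT(*) AS n
    FROM item
    GROUP BY sys
    HAVING COUNT(*) > 1
    ORDER BY sys
""").fetchall()

print(fanout)
```

A key that appears in this list with a count of n turns each matching row from the other table into n rows in the join result.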

Re: Spark join produce duplicate rows in resultset

2023-10-22 Thread Bjørn Jørgensen

also remove the space in rev. scode

On Sun, Oct 22, 2023 at 19:08, Sadha Chilukoori wrote:
> Hi Meena,
>
> I'm asking to clarify, are the *on* & *and* keywords optional in the join
> conditions?
>
> Please try this snippet, and see if it helps
>
> select rev.* from rev
> inner join customer c
> on

Re: Spark join produce duplicate rows in resultset

2023-10-22 Thread Sadha Chilukoori

Hi Meena,

I'm asking to clarify, are the *on* & *and* keywords optional in the join conditions?

Please try this snippet, and see if it helps

select rev.* from rev
inner join customer c
on rev.custumer_id =c.id
inner join product p
on rev.sys = p.sys
and rev.prin = p.prin
and rev.scode= p.bcode

Re: Spark join produce duplicate rows in resultset

2023-10-22 Thread Patrick Tucci

Hi Meena,

It's not impossible, but it's unlikely that there's a bug in Spark SQL randomly duplicating rows. The most likely explanation is there are more records in the item table that match your sys/custumer_id/scode criteria than you expect.

In your original query, try changing select rev.* to
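The explanation above can be reproduced on a toy dataset. This is a minimal sketch under stated assumptions: the rev and item tables below are hypothetical and joined only on sys (the real join condition is cut off in this archive), and sqlite3 is used as a stand-in for Spark SQL, since the join behaviour shown is standard SQL.

```python
import sqlite3

# Hypothetical data: one rev row, four item rows sharing the same sys value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rev  (rev_id INTEGER, sys TEXT);
    CREATE TABLE item (item_id INTEGER, sys TEXT);
    INSERT INTO rev  VALUES (1, 'A');
    INSERT INTO item VALUES (10, 'A'), (11, 'A'), (12, 'A'), (13, 'A');
""")

# Selecting only rev.* hides the item columns, so the four rows look identical.
only_rev = conn.execute(
    "SELECT rev.* FROM rev LEFT JOIN item i ON rev.sys = i.sys"
).fetchall()

# Adding i.* shows that each output row pairs the rev row with a different item row.
with_item = conn.execute(
    "SELECT rev.*, i.* FROM rev LEFT JOIN item i ON rev.sys = i.sys "
    "ORDER BY i.item_id"
).fetchall()

print(only_rev)
print(with_item)
```

With only rev.* in the select list the output looks like one row duplicated; once the item columns are included, the rows are visibly distinct.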

Spark join produce duplicate rows in resultset

2023-10-21 Thread Meena Rajani

Hello all:

I am using Spark SQL to join two tables. To my surprise, I am getting redundant rows. What could be the cause?

select rev.* from rev
inner join customer c
on rev.custumer_id =c.id
inner join product p
rev.sys = p.sys
rev.prin = p.prin
rev.scode= p.bcode
left join item I
on rev.sys = i


5 matches
