RE: Best way to process lookup ETL with Dataframes

2017-01-04 Thread Sesterhenn, Mike
ght? Thanks, -Mike From: Nicholas Hakobian [mailto:nicholas.hakob...@rallyhealth.com] Sent: Friday, December 30, 2016 5:50 PM To: Sesterhenn, Mike Cc: ayan guha; user@spark.apache.org Subject: Re: Best way to process lookup ETL with Dataframes Yep, sequential joins is what I have done in the p

Re: Best way to process lookup ETL with Dataframes

2016-12-30 Thread Nicholas Hakobian
>> >> Reading about the Oracle nvl function, it seems it is similar to the na >> functions. Not sure it will help though, because what I need is to join >> after the first join fails. >> >> -- >> *From:* ayan guha >> *Sent:* Thursday, Decem

Re: Best way to process lookup ETL with Dataframes

2016-12-30 Thread Sesterhenn, Mike
row because bad data will result. Any other thoughts? From: Nicholas Hakobian Sent: Friday, December 30, 2016 2:12:40 PM To: Sesterhenn, Mike Cc: ayan guha; user@spark.apache.org Subject: Re: Best way to process lookup ETL with Dataframes It looks like Sp

Re: Best way to process lookup ETL with Dataframes

2016-12-30 Thread Nicholas Hakobian
-- > *From:* ayan guha > *Sent:* Thursday, December 29, 2016 11:06 PM > *To:* Sesterhenn, Mike > *Cc:* user@spark.apache.org > *Subject:* Re: Best way to process lookup ETL with Dataframes > > How about this - > > select a.*, nvl(b.col,nvl(c.col,'

Re: Best way to process lookup ETL with Dataframes

2016-12-30 Thread Sesterhenn, Mike
hat I need is to join after the first join fails. From: ayan guha Sent: Thursday, December 29, 2016 11:06 PM To: Sesterhenn, Mike Cc: user@spark.apache.org Subject: Re: Best way to process lookup ETL with Dataframes How about this - select a.*, nvl(b.col,nvl(

Re: Best way to process lookup ETL with Dataframes

2016-12-29 Thread ayan guha
How about this - select a.*, nvl(b.col,nvl(c.col,'some default')) from driving_table a left outer join lookup1 b on a.id=b.id left outer join lookup2 c on a.id=c.id ? On Fri, Dec 30, 2016 at 9:55 AM, Sesterhenn, Mike wrote: > Hi all, > > > I'm writing an ETL process with Spark 1.5, and I was w
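
For anyone skimming the thread, here is a rough sketch of the fallback-lookup query suggested above. It is not Spark: it runs the same join shape against an in-memory SQLite database from Python, only to show how the result falls through from lookup1 to lookup2 to the default. SQLite has no nvl, so COALESCE stands in for the nested nvl calls. The table names and the 'some default' literal come from the message; the name column, the sample rows, and the assumption that each lookup has at most one row per id are made up for illustration.

```python
import sqlite3

# In-memory database standing in for the Spark tables (assumption: not Spark).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driving_table (id INTEGER, name TEXT);
CREATE TABLE lookup1 (id INTEGER, col TEXT);
CREATE TABLE lookup2 (id INTEGER, col TEXT);

-- Hypothetical sample data: id 1 is in both lookups, id 2 only in lookup2,
-- id 3 in neither.
INSERT INTO driving_table VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO lookup1 VALUES (1, 'from lookup1');
INSERT INTO lookup2 VALUES (1, 'ignored'), (2, 'from lookup2');
""")

# COALESCE(b.col, c.col, 'some default') plays the role of
# nvl(b.col, nvl(c.col, 'some default')) in the original query.
query = """
SELECT a.id, COALESCE(b.col, c.col, 'some default') AS col
FROM driving_table a
LEFT OUTER JOIN lookup1 b ON a.id = b.id
LEFT OUTER JOIN lookup2 c ON a.id = c.id
ORDER BY a.id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both left joins are always evaluated; the fallback happens when the value is picked, not by skipping the second join. If a lookup table can hold several rows for one id, the joins will duplicate driving rows, so the lookups would need to be deduplicated first.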