You can use the Spark DataFrame `when`/`otherwise` functions to replace the SQL CASE statement.
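A minimal sketch of `when`/`otherwise` for the CASE expression, assuming a DataFrame `df` with the `approval_id` column from the quoted query (the simpler two-condition form here is for illustration only):

```scala
import org.apache.spark.sql.functions.{when, col, lit}

// Equivalent of: CASE WHEN approval_id = 2 THEN 'Yes' ELSE 'No' END AS is_current
val result = df.withColumn("is_current",
  when(col("approval_id") === 2, lit("Yes")).otherwise(lit("No")))
```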
This piece will need to be calculated first: `select student_id from tbl_student where candidate_id = c.candidate_id and approval_id = 2 and academic_start_date is null`. Take the count of that DataFrame after joining the tbl_student and tbl_network DataFrames on the condition above. Overall, you can join all three tables first and then apply the rest of the query to the resulting DataFrame.

On Sat, Sep 22, 2018 at 1:08 AM Chetan Khatri <chetan.opensou...@gmail.com> wrote:

> Dear Spark Users,
>
> I came across a slightly weird MSSQL query to replace with Spark, and I have
> no clue how to do it in an efficient way with Scala + SparkSQL. Can someone
> please throw some light on it? I can create a view of the DataFrame and run it as
> *spark.sql*(query), but I would like to do it the Scala + Spark way.
>
> Sample:
>
> *select a.student_id, a.candidate_id, a.student_name, a.student_standard,*
> *       a.student_city, b.teacher_name, a.student_status, a.approval_id,*
> *       case*
> *         when a.approval_id = 2*
> *          and (a.academic_start_date is null*
> *               and not exists (select student_id from tbl_student*
> *                               where candidate_id = c.candidate_id*
> *                                 and approval_id = 2*
> *                                 and academic_start_date is null))*
> *         then 'Yes' else 'No'*
> *       end as is_current*
> *from tbl_student a*
> *inner join tbl_teacher b on a.candidate_id = b.candidate_id*
> *inner join tbl_network c on c.candidate_id = a.candidate_id*
>
> Thank you.
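The approach above can be sketched in Scala roughly as follows. This assumes DataFrames `studentDf`, `teacherDf`, and `networkDf` have already been loaded from the three tables (all variable names here are hypothetical); the correlated NOT EXISTS is expressed as a left outer join against the pre-computed subquery DataFrame, with a null check standing in for "no match found":

```scala
import org.apache.spark.sql.functions.{when, col, lit}

// The subquery's filter, computed once up front.
// distinct() avoids row multiplication in the later join.
val pending = studentDf
  .filter(col("approval_id") === 2 && col("academic_start_date").isNull)
  .select(col("candidate_id").as("p_candidate_id"))
  .distinct()

// Join all three tables first, as suggested above.
val joined = studentDf.as("a")
  .join(teacherDf.as("b"), col("a.candidate_id") === col("b.candidate_id"))
  .join(networkDf.as("c"), col("c.candidate_id") === col("a.candidate_id"))

// NOT EXISTS(...) -> left outer join; a null p_candidate_id means no match.
val result = joined
  .join(pending, col("a.candidate_id") === col("p_candidate_id"), "left_outer")
  .withColumn("is_current",
    when(col("a.approval_id") === 2 &&
         col("a.academic_start_date").isNull &&
         col("p_candidate_id").isNull, lit("Yes"))
      .otherwise(lit("No")))
  .select(col("a.student_id"), col("a.candidate_id"), col("a.student_name"),
          col("a.student_standard"), col("a.student_city"),
          col("b.teacher_name"), col("a.student_status"),
          col("a.approval_id"), col("is_current"))
```

If you only need to drop the matching rows rather than flag them, the `"left_outer"` join plus null check can be replaced by a `"left_anti"` join.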