Can you please quantify the difference and provide the query code?
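
As a rough sketch of one way to get those numbers (not a definitive benchmark): the helper below times any zero-argument callable with a wall-clock timer and reports the median per label. It uses only the Python standard library. The names `time_action` and `compare` are hypothetical, and the Spark calls in the trailing comments assume a session called `spark` and variables holding your own statement and table name.

```python
import time
from statistics import median


def time_action(action, repeats=3):
    """Run a zero-argument callable `repeats` times; return seconds per run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def compare(actions, repeats=3):
    """Return {label: median wall-clock seconds} for each labelled callable."""
    return {label: median(time_action(fn, repeats))
            for label, fn in actions.items()}


# Hypothetical usage (assumes `spark`, `ctas_statement`, `df1`, `df2` and
# `table_name` already exist). repeats=1 because each case creates a table,
# so a second run would fail unless the table is dropped in between.
#
# results = compare({
#     "sql": lambda: spark.sql(ctas_statement),
#     "dataframe": lambda: df1.union(df2).write.saveAsTable(table_name),
# }, repeats=1)
# print(results)
#
# To check whether the two cases really share a plan, df1.union(df2).explain(True)
# prints the logical and physical plans for the DataFrame case.
```

A single run mostly shows whether the gap is large. Write each case to a different table name, and keep in mind that whichever case runs second may benefit from warm caches.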

On Fri, Mar 29, 2019 at 9:11 AM neeraj bhadani <bhadani.neeraj...@gmail.com>
wrote:

> Hi Team,
>    I am executing the same Spark code using the Spark SQL API and the
> DataFrame API; however, Spark SQL is taking longer than expected.
>
> Please find the pseudocode below.
>
> -----------------------------------------------------------------------------------------------
>
> Case 1 : Spark SQL
>
>
> -----------------------------------------------------------------------------------------------
>
> %sql
>
> CREATE TABLE <tbl_name>
>
> AS
>
>
>  WITH <table_1> AS (
>
>      <qry1>
>
> )
>
> ,<table_2> AS (
>
>      <qry2>
>
>      )
>
>
> SELECT * FROM <table_1>
>
> UNION ALL
>
> SELECT * FROM <table_2>
>
>
>
> -----------------------------------------------------------------------------------------------
>
> Case 2 : DataFrame API
>
>
> -----------------------------------------------------------------------------------------------
>
>
> df1 = spark.sql(<qry1>)
>
> df2 = spark.sql(<qry2>)
>
> df3 = df1.union(df2)
>
> df3.write.saveAsTable(<table_name>)
>
>
> -----------------------------------------------------------------------------------------------
>
>
> As per my understanding, both Spark SQL and the DataFrame API generate the
> same code under the hood, so the execution times should be similar.
>
>
> Regards,
>
> Neeraj
>
>
>

-- 
Thanks,
Jason
