Hi again, has anyone in this group tried to access a SAS dataset through
Spark SQL? Thank you.
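
To make the difference between the two queries in the quoted message below easier to see, here is a minimal sketch in plain Java (no Spark or SAS classes) that builds the same SELECT with and without double quotes around the column names. The class and method names (QuotingSketch, buildSelect, doubleQuoted, unquoted) are made up for illustration. My guess, and it is only an assumption, is that SAS reads the double-quoted names as string literals rather than identifiers, which would explain why every row comes back as SR_NO, start_dt, end_dt.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class QuotingSketch {

    // Joins the column names into a SELECT statement, passing each name
    // through the supplied quoting function first.
    static String buildSelect(List<String> columns, String table, UnaryOperator<String> quote) {
        String cols = columns.stream().map(quote).collect(Collectors.joining(","));
        return "SELECT " + cols + " FROM " + table;
    }

    // Double-quote style, matching the query seen in the Spark SQL log.
    static String doubleQuoted(String name) {
        return "\"" + name + "\"";
    }

    // No quoting, matching the column style of the hand-written query.
    static String unquoted(String name) {
        return name;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("SR_NO", "start_dt", "end_dt");
        String table = "sasLib.run_control";

        // Hypothetical illustration only: prints both query shapes side by side.
        System.out.println(buildSelect(columns, table, QuotingSketch::doubleQuoted));
        System.out.println(buildSelect(columns, table, QuotingSketch::unquoted));
    }
}
```

If that guess is right, the place to change the quoting on the Spark side would, as far as I can tell, be a custom JdbcDialect that overrides quoteIdentifier and is registered with JdbcDialects.registerDialect. I have not tried this against the SAS driver yet.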

Regards,
Ajay

On Friday, June 10, 2016, Ajay Chander <[email protected]> wrote:

> Hi Spark Users,
>
> I hope everyone here is doing great.
>
> I am trying to read data from SAS through Spark SQL and write it into HDFS.
> Initially, I started with a pure Java program; please find the program and
> logs in the attached file sas_pure_java.txt. My program ran successfully
> and returned the data from SAS. Please note the highlighted part in the
> log.
>
> My SAS dataset has 4 rows.
>
> The program ran successfully, and its output is:
>
> [2016-06-10 10:35:21,584] INFO stmt(1.1)#executeQuery SELECT
> a.sr_no,a.start_dt,a.end_dt FROM sasLib.run_control a; created result set
> 1.1.1; time= 0.122 secs (com.sas.rio.MVAStatement:590)
>
> [2016-06-10 10:35:21,630] INFO rs(1.1.1)#next (first call to next); time=
> 0.045 secs (com.sas.rio.MVAResultSet:773)
>
> 1,'2016-01-01','2016-01-31'
>
> 2,'2016-02-01','2016-02-29'
>
> 3,'2016-03-01','2016-03-31'
>
> 4,'2016-04-01','2016-04-30'
>
>
> Please find the full logs attached to this email in file sas_pure_java.txt.
>
> _______________________
>
>
> Now I am trying to do the same via Spark SQL. Please find my program and
> logs attached to this email in file sas_spark_sql.txt.
>
> The connection to the SAS dataset is established successfully, but please
> note the highlighted log lines below.
>
> [2016-06-10 10:29:05,834] INFO conn(2)#prepareStatement sql=SELECT
> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; prepared statement
> 2.1; time= 0.038 secs (com.sas.rio.MVAConnection:538)
>
> [2016-06-10 10:29:05,935] INFO ps(2.1)#executeQuery SELECT
> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; created result set
> 2.1.1; time= 0.102 secs (com.sas.rio.MVAStatement:590)
>
> Please find the full logs attached to this email in file sas_spark_sql.txt.
>
> I am using the same driver in both the pure Java and Spark SQL programs, but
> the query generated by Spark SQL has quotes around the column names
> (highlighted above).
> So my resulting output for that query looks like this:
>
> +-----+--------+------+
> |  _c0|     _c1|   _c2|
> +-----+--------+------+
> |SR_NO|start_dt|end_dt|
> |SR_NO|start_dt|end_dt|
> |SR_NO|start_dt|end_dt|
> |SR_NO|start_dt|end_dt|
> +-----+--------+------+
>
> Since both programs use the same driver, com.sas.rio.MVADriver, I expected
> the output to be the same as my pure Java program's output. But
> something else is happening behind the scenes.
>
> Any insights on this issue? Thanks for your time.
>
>
> Regards,
>
> Ajay
>
