>From looking at the XLConnect package, its loadWorkbook() function only 
>supports reading from a local file path, so you might need a way to call an 
>HDFS command to get the file from HDFS first.

SparkR currently does not support this. You could read the file in as a text 
file (though I don't think .xlsx is a text format), collect to get all the data 
at the driver, and then save it to a local path, perhaps?
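
A minimal sketch of the "get the file from HDFS first" step, assuming the hdfs 
command-line client is installed and configured on the machine running the 
SparkR driver. The helper name fetch_from_hdfs and the paths in the comments 
are hypothetical; only `hdfs dfs -get` is the actual Hadoop command.

```shell
# Hypothetical helper: copy one file out of HDFS to a local path so that a
# local-only reader such as XLConnect's loadWorkbook() can open it.
# Assumes the `hdfs` client is on PATH and points at the right cluster.
fetch_from_hdfs() {
    hdfs_path="$1"    # source in HDFS, e.g. /data/report.xlsx (illustrative)
    local_path="$2"   # local destination, e.g. /tmp/report.xlsx (illustrative)

    # Refuse to run without both arguments.
    if [ -z "$hdfs_path" ] || [ -z "$local_path" ]; then
        echo "usage: fetch_from_hdfs <hdfs_path> <local_path>" >&2
        return 2
    fi

    # Copy the file as-is (binary safe); the exit status of hdfs is returned.
    hdfs dfs -get "$hdfs_path" "$local_path"
}
```

For example, `fetch_from_hdfs /data/report.xlsx /tmp/report.xlsx` would leave a 
local copy that can then be passed to loadWorkbook(). From within R, the same 
command could be run with system2().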

On Wed, Jul 20, 2016 at 3:48 AM -0700, "Rabin Banerjee" 
<dev.rabin.baner...@gmail.com> wrote:

Hi Yogesh,

  I have never tried reading XLS files using Spark, but I think you can use 
sc.wholeTextFiles to read the complete XLS file at once. As XLS files are XML 
internally, you need to read the whole file in order to parse it. Then I think 
you can use Apache POI to read them.

Also, you can copy your XLS data to an MS Access file and access it via JDBC.

Regards,
Rabin Banerjee

On Wed, Jul 20, 2016 at 2:12 PM, Yogesh Vyas 
<informy...@gmail.com> wrote:
Hi,

I am trying to load and read Excel sheets from HDFS in SparkR using the
XLConnect package.
Can anyone help me find out how to read XLS files from HDFS in SparkR?

Regards,
Yogesh
