Hi, have you tried using spark-csv (https://github.com/databricks/spark-csv)? After all, you can convert an Excel file to CSV. HTH.
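
To illustrate the "convert to CSV" idea, here is a minimal sketch in Python (not SparkR) using only the standard library. It relies on the fact that an .xlsx file is a ZIP archive of XML parts; it does not apply to the older binary .xls format. The function names xlsx_sheet_to_rows and rows_to_csv are hypothetical, and the default sheet path xl/worksheets/sheet1.xml is an assumption (a real workbook maps sheet names to parts through its relationship files). The sketch ignores gaps left by empty cells, dates, and formulas, so treat it as a starting point rather than a complete converter. The input is raw bytes, which you would have to obtain from HDFS yourself, for example with sc.binaryFiles in PySpark.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# Main SpreadsheetML namespace used by .xlsx worksheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}
T_TAG = "{%s}t" % NS["m"]


def xlsx_sheet_to_rows(data, sheet_path="xl/worksheets/sheet1.xml"):
    """Return one worksheet's cell values as a list of rows (lists of strings).

    `data` is the raw bytes of an .xlsx file. `sheet_path` is an assumed
    default location for the first sheet inside the archive.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        shared = []
        if "xl/sharedStrings.xml" in zf.namelist():
            root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
            for si in root.findall("m:si", NS):
                # A shared string may be split across several <t> runs.
                shared.append("".join(t.text or "" for t in si.iter(T_TAG)))
        sheet = ET.fromstring(zf.read(sheet_path))

    rows = []
    for row in sheet.findall("m:sheetData/m:row", NS):
        values = []
        for cell in row.findall("m:c", NS):
            cell_type = cell.get("t")
            if cell_type == "inlineStr":
                values.append("".join(t.text or "" for t in cell.iter(T_TAG)))
                continue
            v = cell.find("m:v", NS)
            text = v.text if v is not None and v.text is not None else ""
            if cell_type == "s":
                # The stored value is an index into the shared-string table.
                text = shared[int(text)]
            values.append(text)
        rows.append(values)
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with newline line endings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

The resulting CSV text can then be written out and loaded with spark-csv as suggested above.
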

On Thu, Jul 21, 2016 at 4:25 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:

> From looking at the XLConnect package, its loadWorkbook() function only
> supports reading from a local file path, so you might need a way to call an
> HDFS command to get the file from HDFS first.
>
> SparkR currently does not support this - you could read it in as a text
> file (I don't think .xlsx is a text format though), collect to get all the
> data at the driver, then save to a local path perhaps?
>
> On Wed, Jul 20, 2016 at 3:48 AM -0700, "Rabin Banerjee" <
> dev.rabin.baner...@gmail.com> wrote:
>
> Hi Yogesh,
>
> I have never tried reading XLS files using Spark. But I think you can
> use sc.wholeTextFiles to read the complete xls at once; as xls files
> are xml internally, you need to read them all to parse. Then I think you
> can use Apache POI to read them.
>
> Also, you can copy your XLS data to an MS-Access file to access via JDBC.
>
> Regards,
> Rabin Banerjee
>
> On Wed, Jul 20, 2016 at 2:12 PM, Yogesh Vyas <informy...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to load and read Excel sheets from HDFS in SparkR using the
>> XLConnect package.
>> Can anyone help me find out how to read xls files from HDFS in
>> SparkR?
>>
>> Regards,
>> Yogesh