There was a bug in the SparkContext creation that I fixed yesterday. https://github.com/apache/spark/commit/6b44278ef7cd2a278dfa67e8393ef30775c72726
If you build from master it should be fixed. Also, I think we may have an rc4, which should include this fix.

Thanks
Shivaram

On Tue, Jun 2, 2015 at 12:56 PM, Eskilson,Aleksander <alek.eskil...@cerner.com> wrote:

> Hey, that's pretty convenient. Unfortunately, although the package seems
> to pull fine into the session, I'm getting class-not-found exceptions with:
>
> Caused by: org.apache.spark.SparkException: Job aborted due to stage
> failure: Task 0 in stage 6.0 failed 4 times, most recent failure: Lost task
> 0.3 in stage 6.0: java.lang.ClassNotFoundException:
> com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1
>
> That smells like a path issue to me, and I made sure the ivy repo was
> part of my PATH, but functions like showDF() still fail with that error.
> Did I miss a setting, or should the package inclusion in the sparkR
> invocation load that in?
>
> I've run
>
> df <- read.df(sqlCtx, "./data.csv", "com.databricks.spark.csv",
>               header="true", delimiter="|")
> showDF(df, 10)
>
> (my data is pipe-delimited, and the default SQL context is sqlCtx)
>
> Thanks,
> Alek
>
> From: Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
> Reply-To: "shiva...@eecs.berkeley.edu" <shiva...@eecs.berkeley.edu>
> Date: Tuesday, June 2, 2015 at 2:08 PM
> To: Burak Yavuz <brk...@gmail.com>
> Cc: Aleksander Eskilson <alek.eskil...@cerner.com>, "dev@spark.apache.org" <dev@spark.apache.org>, Shivaram Venkataraman <shiva...@eecs.berkeley.edu>
> Subject: Re: CSV Support in SparkR
>
> Hi Alek,
>
> As Burak said, you can already use spark-csv with SparkR in the 1.4
> release. Right now I use it with something like this:
>
> # Launch SparkR
> ./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
>
> df <- read.df(sqlContext, "./nycflights13.csv",
>               "com.databricks.spark.csv", header="true")
>
> You can also pass other spark-csv options as arguments to `read.df`.
> Let us know if this works.
>
> Thanks
> Shivaram
>
> On Tue, Jun 2, 2015 at 12:03 PM, Burak Yavuz <brk...@gmail.com> wrote:
>
>> Hi,
>>
>> cc'ing Shivaram here, because he worked on this yesterday.
>>
>> If I'm not mistaken, you can use the following workflow:
>>
>> ```./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3```
>>
>> and then
>>
>> ```df <- read.df(sqlContext, "/data", "csv", header = "true")```
>>
>> Best,
>> Burak
>>
>> On Tue, Jun 2, 2015 at 11:52 AM, Eskilson,Aleksander <alek.eskil...@cerner.com> wrote:
>>
>>> Are there any intentions to provide first-class support for CSV files
>>> as one of the loadable file types in SparkR? Databricks' spark-csv API [1]
>>> has support for SQL, Python, and Java/Scala, and implements most of the
>>> arguments of R's read.table API [2], but currently there is no way to load
>>> CSV data in SparkR (1.4.0) besides separating our headers from the data,
>>> loading into an RDD, splitting by our delimiter, and then converting to a
>>> SparkR DataFrame with a vector of the columns gathered from the header.
>>>
>>> Regards,
>>> Alek Eskilson
>>>
>>> [1] -- https://github.com/databricks/spark-csv
>>> [2] -- http://www.inside-r.org/r-doc/utils/read.table
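
Putting the thread's advice together, a minimal SparkR session for a pipe-delimited file with a header row might look like the sketch below. This assumes Spark 1.4 (or a master build containing the fix above) launched with the spark-csv package; the file path `./data.csv` is a placeholder, and `sqlContext` is the SQL context the SparkR shell creates automatically.

```r
# Launch SparkR with the spark-csv package (Scala 2.10 build, version 1.0.3):
#   ./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3

# Inside the SparkR shell, sqlContext already exists. Named arguments
# beyond the source name (header, delimiter, ...) are passed through
# to spark-csv as its reader options.
df <- read.df(sqlContext, "./data.csv", "com.databricks.spark.csv",
              header = "true", delimiter = "|")

# Inspect the first rows and the inferred schema.
showDF(df, 10)
printSchema(df)
```

Note that the package must be supplied at launch time via `--packages` so its jars reach the executors; a ClassNotFoundException for a spark-csv class at task time (as in the stack trace above) indicates the jars were not distributed, not a problem with the local ivy cache being on PATH.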