Please find the code below.

    sankar2 <- read.df("/nfspartition/sankar/test/2016/08/test.json")

I tried these two commands:

    write.df(sankar2, "/nfspartition/sankar/test/test.csv", "csv", header = "true")
    saveDF(sankar2, "sankartest.csv", source = "csv", mode = "append", schema = "true")
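For reference, the same write with the mode = "overwrite" setting Kevin suggests below would look roughly like this (untested sketch; argument names as in the SparkR write.df documentation linked further down):

    # Untested sketch: same write as above, but with mode = "overwrite" so Spark
    # replaces any existing output directory instead of failing on it
    # (sankar2 is the DataFrame read from the JSON file above).
    write.df(sankar2,
             path = "/nfspartition/sankar/test/test.csv",
             source = "csv",
             mode = "overwrite",
             header = "true")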
On Tue, Sep 20, 2016 at 9:40 PM, Kevin Mellott <kevin.r.mell...@gmail.com> wrote:

> Can you please post the line of code that is doing the df.write command?
>
> On Tue, Sep 20, 2016 at 9:29 AM, Sankar Mittapally <sankar.mittapally@creditvidya.com> wrote:
>
>> Hey Kevin,
>>
>> It is an empty directory. Spark is able to write the part files to the
>> directory, but we get the above error while it is merging those part files.
>>
>> Regards
>>
>> On Tue, Sep 20, 2016 at 7:46 PM, Kevin Mellott <kevin.r.mell...@gmail.com> wrote:
>>
>>> Have you checked to see if any files already exist at
>>> /nfspartition/sankar/banking_l1_v2.csv? If so, you will need to delete
>>> them before attempting to save your DataFrame to that location (see the
>>> sketch at the end of this thread). Alternatively, you may be able to set
>>> the "mode" option of the df.write operation to "overwrite", depending on
>>> the version of Spark you are running.
>>>
>>> *ERROR (from log)*
>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]:
>>> it still exists.
>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]:
>>> it still exists.
>>>
>>> *df.write Documentation*
>>> http://spark.apache.org/docs/latest/api/R/write.df.html
>>>
>>> Thanks,
>>> Kevin
>>>
>>> On Tue, Sep 20, 2016 at 12:16 AM, sankarmittapally
>>> <sankar.mittapa...@creditvidya.com> wrote:
>>>
>>>> We have set up a Spark cluster on NFS shared storage. There are no
>>>> permission issues with the NFS storage; all users are able to write to
>>>> it. When I run the write.df command in SparkR, I get the error below.
>>>> Can someone please help me fix this issue?
>>>>
>>>> 16/09/17 08:03:28 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
>>>> java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus
>>>> {path=file:/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv;
>>>> isDirectory=false; length=436486316; replication=1; blocksize=33554432;
>>>> modification_time=1474099400000; access_time=0; owner=; group=;
>>>> permission=rw-rw-rw-; isSymlink=false}
>>>> to file:/nfspartition/sankar/banking_l1_v2.csv/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv
>>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
>>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:384)
>>>> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:326)
>>>> at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>>>> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>>>> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>>>> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>>>> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>>>> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>>>> at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
>>>> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
>>>> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>> at java.lang.reflect.Method.invoke(Method.java:498)
>>>> at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
>>>> at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
>>>> at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
>>>> at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>> at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>> at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
>>>> at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>>>> at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>>>> at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>>>> at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>>>> at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>>>> at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>>>> at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>>>> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>>>> at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>>>> at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>>>> at java.lang.Thread.run(Thread.java:745)
>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]:
>>>> it still exists.
>>>> 16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir
>>>> [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]:
>>>> it still exists.
>>>> 16/09/17 08:03:28 ERROR DefaultWriterContainer: Job job_201609170803_0000 aborted.
>>>> 16/09/17 08:03:28 ERROR RBackendHandler: save on 625 failed
>>>> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
>>>>   org.apache.spark.SparkException: Job aborted.
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
>>>> at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
>>>> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doE
>>>
>>
>> --
>> Regards
>>
>> Sankar Mittapally
>> Senior Software Engineer
>>
>

--
Regards

Sankar Mittapally
Senior Software Engineer
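A quick sketch of Kevin's other suggestion above (clearing the existing output directory before re-running the save), in plain R. This is untested and only illustrative: "df" is a placeholder for whatever DataFrame is being saved, and the path is the output directory from the log above.

    # Untested sketch of the "delete it first" approach: remove any leftover
    # output (including the _temporary and .crc files named in the warnings
    # above) before writing again. "df" is a placeholder DataFrame.
    out <- "/nfspartition/sankar/banking_l1_v2.csv"
    if (dir.exists(out)) {
      unlink(out, recursive = TRUE)   # clears stale part files and .crc files
    }
    write.df(df, path = out, source = "csv", header = "true")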