Thanks to both of you! You solved the problem.

Thanks,
Erik Stensrud
Sent from my iPhone

On 16 Jun 2015, at 20:23, Guru Medasani <gdm...@gmail.com> wrote:

Hi Esten,

It looks like your sqlContext is connected to a Hadoop/Spark cluster, but the file path you specified is local:

mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source = "json", header = "false")

The error below shows that the input path you specified does not exist on the cluster; pointing to the right HDFS path should help here:

Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json

Guru Medasani
gdm...@gmail.com

On Jun 16, 2015, at 10:39 AM, Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:

The error you are running into is that the input file does not exist -- you can see it from the following line:

"Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json"

Thanks
Shivaram

On Tue, Jun 16, 2015 at 1:55 AM, esten <erik.stens...@dnvgl.com> wrote:

Hi,

In the SparkR shell, I invoke:

> mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source = "json",
>                 header = "false")

I have tried various file types (csv, txt); all fail.

RESPONSE: "ERROR RBackendHandler: load on 1 failed"

BELOW IS THE WHOLE RESPONSE:

15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB)
15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556
15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB)
15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB)
15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2
15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
        at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
        at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
        at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
        at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
        at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
        at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
        at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
        ... 25 more
Error: returnStatus == 0 is not TRUE

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-1-4-0-read-df-function-fails-tp23333.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
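For anyone landing on this thread with the same error, here is a minimal sketch of the fix Guru describes above: copy the file into HDFS and pass read.df the full HDFS URI. It assumes the SparkR 1.4 shell (where sqlContext is predefined), an HDFS client available on the machine running SparkR, and a target directory /user/esten/ami that is purely illustrative, not taken from the thread.

    # Copy the local file into HDFS first. system() just shells out; the
    # same commands can be run directly from a terminal.
    # NOTE: /user/esten/ami is a hypothetical target directory; adjust it
    # to your own HDFS layout.
    system("hdfs dfs -mkdir -p /user/esten/ami")
    system("hdfs dfs -put /home/esten/ami/usaf.json /user/esten/ami/")

    # Point read.df at the cluster copy, spelling out the full HDFS URI
    # (scheme, namenode, port) as it appears in the error message above.
    mydf <- read.df(sqlContext,
                    "hdfs://smalldata13.hdp:8020/user/esten/ami/usaf.json",
                    source = "json")

    # Alternative: read the local file directly via a file:// URI. This
    # only works when the path is readable on every node that runs tasks,
    # e.g. in local mode or on a shared mount.
    # mydf <- read.df(sqlContext, "file:///home/esten/ami/usaf.json",
    #                 source = "json")

The underlying point is that a bare path such as /home/esten/ami/usaf.json is resolved against the cluster's default filesystem, here hdfs://smalldata13.hdp:8020, which is why the stack trace reports a missing HDFS path even though the file exists on the local disk.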