I have two questions related to PySpark.

1. How do I load an Avro file that is on the local filesystem, as opposed to HDFS? I tried the following and just get NullPointerExceptions:
    avro_rdd = sc.newAPIHadoopFile(
        "file:///c:/my-file.avro",
        "org.apache.avro.mapreduce.AvroKeyInputFormat",
        "org.apache.avro.mapred.AvroKey",
        "org.apache.hadoop.io.NullWritable",
        keyConverter="org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter",
        conf=None)

2. If I have the Avro data as a stream of bytes, say avrobytes, is there a way I can get it into Spark, i.e. build an RDD from it?

Let me know if either of the above is possible and, if so, how; sketches of what I have tried or considered are below.
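For question 1, the fuller form of what I have been trying is below. It is modelled on the avro_inputformat.py example that ships with Spark; the explicit reader schema, the c:/my-file.avsc path, and the classpath step are my assumptions about what might be missing, not things I have confirmed fix the NullPointerException.

    # Sketch modelled on examples/src/main/python/avro_inputformat.py from the
    # Spark source tree. Presumably the spark-examples jar (which contains
    # AvroWrapperToJavaConverter) has to be on the classpath, e.g.
    #   bin/spark-submit --jars /path/to/spark-examples.jar this_script.py
    from pyspark import SparkContext

    sc = SparkContext(appName="LocalAvro")

    # Optional explicit reader schema; "avro.schema.input.key" is the conf key
    # the bundled example uses. "c:/my-file.avsc" is a hypothetical schema file.
    with open("c:/my-file.avsc") as f:
        conf = {"avro.schema.input.key": f.read()}

    avro_rdd = sc.newAPIHadoopFile(
        "file:///c:/my-file.avro",
        "org.apache.avro.mapreduce.AvroKeyInputFormat",
        "org.apache.avro.mapred.AvroKey",
        "org.apache.hadoop.io.NullWritable",
        keyConverter="org.apache.spark.examples.pythonconverters.AvroWrapperToJavaConverter",
        conf=conf)

    # AvroKeyInputFormat puts the whole record in the key and NullWritable in
    # the value, so keep only the first element of each pair.
    records = avro_rdd.map(lambda kv: kv[0]).collect()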
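As a possible workaround for both questions, I have also considered parsing the Avro container in plain Python and handing the records to Spark afterwards. A rough, untested sketch, assuming the avro package is installed; avrobytes is the byte string from question 2 and the file path is a placeholder. Everything is read on the driver first, so this only makes sense for data that fits in memory.

    # Sketch: parse the Avro container on the driver with the "avro" Python
    # package, then parallelize the resulting record dicts into an RDD.
    import io
    from avro.datafile import DataFileReader
    from avro.io import DatumReader
    from pyspark import SparkContext

    sc = SparkContext(appName="LocalAvroFallback")

    # Question 1: from a local file.
    with open("c:/my-file.avro", "rb") as f:
        records = list(DataFileReader(f, DatumReader()))
    rdd = sc.parallelize(records)

    # Question 2: from bytes already in memory (avrobytes). DataFileReader
    # needs a seekable file-like object, which BytesIO provides.
    # records = list(DataFileReader(io.BytesIO(avrobytes), DatumReader()))
    # rdd = sc.parallelize(records)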